polyglot-tools-docs

Future plans

This is not an exhaustive list - I have a Trello full of tasks and ideas! But I thought it'd be worth sharing things I'm thinking of doing.

I'm also keen to get feedback - if you think I should focus on some areas of this more than others, feel free to contact me - my contact details are on korny.info

Short-term / small things

Improving this website

This site is nice for getting up and running quickly, but I keep running into little inconveniences using someone else's theme - for instance I was going to add Disqus comments, but it got fiddly making it play nicely with the theme. I probably need to move the site to a plain Gatsby site - it's not that hard to do!

Fixing the Coupling logic

Coupling is really a proof-of-concept at the moment, it needs work.

Probably as a minimum, I should ditch the "put everything in 24-hour buckets" approach - it leads to far too many false positives, especially on active projects where some files get changed daily.

Supporting mercurial or other VCSs

Firefox code uses mercurial not git - and while git is pretty dominant, legacy codebases often use something else. It'd be nice to have scanner logic for other kinds of VCS!

Medium term / big things

Build my own Voronoi treemap library

The current Javascript one works, but it has bugs and my workarounds are ugly. Also, it'd be really nice to have one that was fast enough to use in real time!

My ideal would be to have a rust algorithm, and then compile it to WebAssembly so it could be called from the browser.

However, I made some attempts to port it to rust, and hit some solid walls. The underlying algorithm is based on (Nocaj and Brandes 2012) which is great, but the algorithms for convex hulls and the like are all heavily dependant on mutation of data structures, which are non-trivial in rust. It'd be easier to do a complete rewrite.

Read more research

There's a lot of cool research out there - I'd like my plans to be based at least somewhat on what research seems to be showing is relevant, rather than my gut feel.

More and different things to visualise

There are so many things that could be shown! I'm learning more as I read the research, but some things from my list:

  • Show user information in more ways - impact of individuals, teams, how people move between teams
  • Allow user editing - especially merging identities if someone uses multiple emails
  • Show ownership more clearly - distinguish between where 5 people changed a file equally, and where one person made 90% of those changes
  • Show more clearly how users change over time - research talks about the impact of new people on a team, we should be able to visualise that

Scanner improvements

  • ~Following renames - the scanner sees file renames, but it doesn't do a great job of matching earlier changes to the final file name!~ - fixed in release 0.2.0!
  • More tests - I started with a lot of unit tests, but I haven't always added tests for newer features, especially as some need quite complex test data setup
  • Test suites - I want to craft some git repos with real test scenarios, e.g. multiple authors, PRs, rebases. This is quite tricky to do! But it'd improve testing a lot. One option is to base tests on a real-world open-source codebase, but it might make tests a bit fragile.

Some day maybe

Better historical views

At the moment the date slider is a bit deceptive - a lot of metrics are based on the current status of files, so looking at an older time interval doesn't really show what the world was like then.

This could be fixed, but it'd be a lot of work! The scanner would need to look at all git deletes, and reconstitute the state of the main branch at each point in time. Also probably you'd need to then calculate all other metrics at regular (monthly? yearly?) snapshots. I'd have to consider if this was worth the effort - though being able to scan backwards through metrics over time would be awesome...

Explorer tests

I'm a big fan of testing. The Explorer however has no tests at all! It'd be good to look into ways to test my javascript - probably in a few stages:

  1. Add unit tests for all the raw javascript functions - this should be easy!
  2. Add tests for the complex D3 areas - still should be unit testable, but might be more fiddly
  3. Add journey tests for the overall UI - might not actually be worth it.

Better file parsing

I really like keeping things simple - I want the parsing to always work, even for plain text files / obscure languages no-one has looked at.

However, it did occur to me that I could get more language information - method / class lengths, say - from another class of file parser that is widespread: syntax highlighting code. Github has code to highlight a vast range of programming languages, as do libraries like prism.js. I've even seen tools which hook into Visual Studio Code's engine to highlight code: https://github.com/andrewbranch/gatsby-remark-vscode - something like this might provide a way to get more insights into code, without writing thousands of language parsers...

Other visualisations

There are lots of other things I'd like to explore - especially more detailed git history visualisations. This will probably have to wait until I have a need to do these as part of my day job!

Edit this page on GitHub