This article is not a tutorial. This article is for people like me who think that they need to use D3 to visualize their data and have absolutely no idea what they are getting themselves into.
“There are known knowns. There are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don't know. But there are also unknown unknowns. There are things we do not know we don't know.” – Donald Rumsfeld
Known Knowns
It's nice to have a roadmap before you start driving, even if you don't have all the landmarks pinpointed. Data visualization is really easy to do badly. Think of the following as things you should definitely know before you start.
What are the basics?
Unless you’re already a data wizard, consider reading Data + Design, an excellent guide that distills data concepts into something more palatable for the rest of us, starting with how raw data should be collected, working through understanding the results and cleaning the data, and discussing some basics of data visualization.
Do you have data? If the answer is no, that’s not ideal, but it’s also not a total blocker. Unless you are using a public API or the data collection phase has been completed before you start building your visualization, it’s likely you’ll be working with a test set. Working with test data comes with some challenges that need to be addressed before diving in:
- Do you have control over the structure of the data, or will you need to parse it?
- Where will your data ultimately come from?
- Will you need to write a function that provides random test sets in order to ensure your chart can handle edge cases within likely thresholds?
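If you do end up writing that random test set function, it does not need to be elaborate. The following is a minimal sketch in plain JavaScript, not a prescribed approach: the `{ label, value }` row shape, the `generateTestSet` name, and its options are all assumptions made for illustration, so swap in whatever structure your real data is likely to have. It uses a small seeded generator so that a test set that breaks your chart can be recreated, and it pins the first and last rows to the minimum and maximum so both thresholds are always exercised.

```javascript
// Small seeded pseudo-random generator (mulberry32). Math.random() cannot be
// seeded, so this lets you recreate the exact test set that broke a chart.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical helper: builds `count` rows with values between min and max.
// The { label, value } shape is an assumption, not something D3 requires.
function generateTestSet({ count = 10, min = 0, max = 100, seed = 1 } = {}) {
  const random = mulberry32(seed);
  const rows = [];
  for (let i = 0; i < count; i++) {
    rows.push({
      label: "item-" + i,
      value: min + random() * (max - min),
    });
  }
  // Pin the extremes so the chart is always tested at both thresholds.
  if (count >= 2) {
    rows[0].value = min;
    rows[count - 1].value = max;
  }
  return rows;
}

// Example: five rows between -10 and 10, reproducible via the seed.
console.log(generateTestSet({ count: 5, min: -10, max: 10, seed: 42 }));
```

Calling it with `count: 0` returns an empty array, which is a useful edge case in its own right: a chart should survive having nothing to draw.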
What are you trying to communicate?
Your data is telling a story. Whether it’s a series of charts on a dashboard relaying server statistics, or a choropleth map illustrating regional trends, your visualization should do the heavy lifting of data interpretation.
The possibilities for creative interpretation of numbers are seemingly endless, especially with a powerful tool like D3. You may have been inspired by some of the visualizations you have seen in the wild.
As utterly stunning as many of those visualizations are, it’s possible a simple line, pie, or bar chart will work charmingly for what you need to do. Think about your audience and what you want them to understand about the data. Many people understand that line charts are often used to show trends over time, that pie charts represent percentages of a whole, and that bar charts compare categorical data.
Complex visualizations may require additional explanation. If your audience is a casual reader, it might not make sense to introduce them to their first chord diagram, unless you take the time to walk them through it.
Why do people use D3?
D3 is like having access to an industrial workshop full of impressive and dangerous machines. It is the most flexible option; you have the most control, and you can make it as complicated as you want. Each D3 module has a toolset of complex helper methods that can:
- Fetch and analyze your data.
- Process your data by cleaning and formatting it.
- Handle complex animations, interactivity, and dynamic data.
To do things properly, you will not only need to learn how to use the tools, you will also need to educate yourself on what each tool does before you can determine if it’s the right tool for the task at hand. That makes for a rather steep learning curve. For an interesting perspective on how people should be thinking about D3, read D3 is not a Data Visualization Library. If the title of that article seems a little confusing, that’s really just the tip of the D3 iceberg.
If you simply don’t have the time or inclination to learn how D3 works before starting your project, skipping that step is certainly an option. It just might not be such a great idea.
What does D3 not do?
D3 does not do a lot of hand-holding, and it is not for beginners. You might think of other data visualization options as being more similar to buying and building Ikea furniture. D3 does not provide pre-cut and drilled visualization solutions.
There are many examples of visualizations built in D3, but they are not templates with instruction manuals, and they’re mostly built using outdated versions of D3.
Should you use D3?
There are a lot of data visualization libraries out there. There are a lot of ways to visualize data. Research the options. While you do this, think about your general skill level, look at actual examples of the code, and consider how it will work in the environment you want to use. Spending a few hours on this will hopefully save you from having an experience like discovering that the library you chose does not work in your environment.
It’s quite possible that D3 is the best option for you. In the interest of due diligence, ask yourself some questions first:
Known Unknowns
If you hop straight into D3’s documentation, you may find yourself overwhelmed by the rather exhaustive list of modules available. After spending a few minutes poking around the various functions of D3, you may ponder the choices that led you to this moment.
Never fear! Much of D3 exists within the realm of the known unknowns. The tricky part is determining which unknowns are worth knowing, making those unknowns known, and forgiving yourself for the things you simply cannot be bothered to know. If you seriously want to stick with D3, and you need your visualization done yesterday, there are some things that might help you get through it.
Don’t reinvent the wheel any more than necessary.
Simply searching for the functionality that you want within the context of D3 will turn up a lot of working code examples. A lot of people share their D3 projects on GitHub as gists, which you can view on bl.ocks.org.
According to my browser’s history, it apparently took me over a thousand Google searches to avoid learning D3 through an online course. To complete my dynamic, interactive charts and visualizations, I essentially frankensteined a bunch of features together from a variety of sources.
What if the recycled code does not work?
A warning about recycling code: pay attention to the version number. D3 has been around for a while, and you should be able to identify the version in most code examples. If you want to stick with the newest version of D3, but the tasty code snippet you desire is two versions out of date, it’s quite possible it will not work for you without some modification. D3 had a pretty major structural overhaul between versions 3 and 4.
I approached this in two ways: either by looking through the D3 changelog, or simply by sifting through the current documentation. A lot of fixes involved minor tweaks, like changing d3.scale.linear to d3.scaleLinear.
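To make that kind of tweak a little less tedious, you could scan a borrowed snippet for old names before pasting it in. The sketch below is plain JavaScript and does not load D3 itself. The `V3_TO_V4` table and the `findOutdatedCalls` helper are hypothetical names invented for this example, and the table is deliberately partial: it lists a handful of renames from the version 3 to version 4 overhaul, and the D3 changelog remains the authority for the rest.

```javascript
// Partial lookup of D3 v3 names and their v4 replacements.
// Illustrative only: many more names changed, and some v3 features
// (axes, for example) did not map one-to-one onto a single new name.
const V3_TO_V4 = {
  "d3.scale.linear": "d3.scaleLinear",
  "d3.scale.ordinal": "d3.scaleOrdinal",
  "d3.svg.line": "d3.line",
  "d3.svg.arc": "d3.arc",
  "d3.layout.pie": "d3.pie",
  "d3.time.format": "d3.timeFormat",
};

// Hypothetical helper: reports which old names appear in a code snippet.
// It is a plain substring check, so it only flags names; it does not
// rewrite anything or understand the code around them.
function findOutdatedCalls(source) {
  const findings = [];
  for (const [oldName, newName] of Object.entries(V3_TO_V4)) {
    if (source.includes(oldName)) {
      findings.push({ oldName, newName });
    }
  }
  return findings;
}

// Example: a v3-era line of the kind found in older code examples.
const snippet = "var x = d3.scale.linear().domain([0, 10]).range([0, 500]);";
console.log(findOutdatedCalls(snippet));
```

A renamed function is not guaranteed to behave identically to the old one, so treat each finding as a prompt to check the current documentation rather than as a safe find-and-replace.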
Drop some example code into your environment.
This might seem obvious, but it’s something I neglected to do, and that caused me some headaches as a consequence. Getting D3 running in your environment could potentially be more challenging than you expect. It might be that you were planning on doing this all along, because where else would you build out your data visualization?
I decided to build my visualizations in CodePen to avoid waiting for my JavaScript to compile. There were no API endpoints for me to use, and CodePen gave me feedback much more quickly as I stitched D3 into the shape I desired. When I moved the code into the real application, I ran into problems with rendering in React, and because I was using fetch, some issues with Internet Explorer 11 bubbled up. To some degree, this is just what life using webpack, React, and Babel is like, and hindsight is 20/20, but I could have discovered these issues earlier in the game, which might have led to more scrutiny about whether or not D3 was the right library to use.
Unknown Unknowns
Working in an environment as varied as our current web development ecosphere means that you may run into problems whose solutions are not a quick Google search away. As with learning and using any technology, there are a few things that are nice to have, especially when it comes to anticipating your blind spots.
You are not alone
No person is an island! Other developers are a tremendously valuable resource, and some of them might even like D3 enough to enjoy talking about it with you. If I had taken the time to reach out to my developer community before hurtling into the project at full speed, they might have been able to warn me about some of the pitfalls ahead. I very innocently had no clue (that might explain over a thousand Google searches with D3 in the query). Ask them what their experience was like, what challenges they had, and where they went to look for answers when things went pear-shaped.
Consider failure
Success is not guaranteed. A lot of us are trained to believe that if we want it badly enough and we work enough hours, we will overcome all challenges with graceful solutions. If you start thinking about what failure looks like earlier in the game, you might be able to identify some weak points in your strategy. What happens if the data is bad? What if the story you want to tell with your data cannot be communicated effectively through your visualization? What if you have a hard deadline that you cannot meet?
If you can shelve some of your assumptions, you may find yourself open to more flexible solutions. Perhaps D3 is not the right tool for the project you have in mind, but your research brings you closer to understanding when it is perfect for something else. Perhaps D3 is exactly what you need, and that research will help you avoid spinning your wheels.
Learning and working in D3 was an excellent experience for me, and D3 was probably the correct tool for the work that I needed to do, but if I could go back in time and do more preliminary research, my search history might have only been 500 Google searches.