Last summer, I wrote a TRB paper in which piece-wise multiple linear models (PMLMs) were developed to predict pavement marking retroreflectivity in Minnesota, Pennsylvania, and Florida. The primary goal of the paper was to develop robust models that can predict retroreflectivity of pavement marking materials under various winter weather conditions.
In the paper, I compared the proposed model with conventional multiple linear models (MLMs), a popular type of regression model used to predict pavement marking retroreflectivity. I made the comparison by visualizing the difference in prediction-observation plots, which I created with ggplot2. As shown in the following figures, the horizontal axis depicts the observed retroreflectivity and the vertical axis denotes the predicted retroreflectivity. Each point in the graphic represents one pair of prediction and observation. If the point falls on the 45-degree dotted line, its prediction is the same as the observation. If the point is above the 45-degree line, its prediction is higher than the observation. Conversely, if the point is below the 45-degree line, its prediction is lower than the observation. The points are coded in three half-transparent colors, one per state, to enable comparison within each state.
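To make the reading rule concrete, here is a minimal Python sketch of how a single point relates to the 45-degree line. It is only an illustration, not the code used for the paper: the function name `classify_point`, the tolerance `tol`, and the sample values are all assumptions made up for this example.

```python
def classify_point(observed, predicted, tol=1e-9):
    """Say where a point sits relative to the 45-degree line.

    'on'    -> prediction equals observation (within tol)
    'above' -> prediction is higher than observation
    'below' -> prediction is lower than observation
    """
    diff = predicted - observed
    if abs(diff) <= tol:
        return "on"
    return "above" if diff > 0 else "below"


# Hypothetical (observed, predicted) pairs, not data from the paper.
pairs = [(100.0, 100.0), (80.0, 95.0), (150.0, 120.0)]
labels = [classify_point(obs, pred) for obs, pred in pairs]
print(labels)
```

The sign of `predicted - observed` is all the plot encodes vertically: positive differences land above the dotted line and negative ones below it.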
Figure 1. Predicted versus Observed Retroreflectivity (Yellow Methyl Methacrylate on Concrete Pavement)
Figure 1 compares the results obtained from the MLM (left) and the proposed PMLM (right) to predict retroreflectivity of methyl methacrylate on concrete pavements. From this figure, it is clear that the proposed PMLM performed much better than the MLM. For example: (1) PMLM points are better aligned with the 45-degree line; (2) PMLM points are closer to the 45-degree line, indicating better accuracy; (3) the MLM performed extremely poorly when the observed value was low, whereas the PMLM performed consistently regardless of the observed value.
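"Closer to the 45-degree line" can also be put into a number. One simple option, which is my choice for this illustration and not necessarily the metric reported in the paper, is the mean absolute difference between prediction and observation. The Python sketch below uses made-up values and the hypothetical names `model_a` and `model_b`; none of the numbers come from the actual models.

```python
def mean_abs_error(observed, predicted):
    """Average vertical distance of the points from the 45-degree line."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) for o, p in zip(observed, predicted))
    return total / len(observed)


# Made-up values for illustration only.
observed = [60.0, 100.0, 140.0]
model_a = [90.0, 110.0, 130.0]  # misses badly at the low observed value
model_b = [65.0, 98.0, 143.0]   # stays close to the line throughout

mae_a = mean_abs_error(observed, model_a)
mae_b = mean_abs_error(observed, model_b)
print(mae_a, mae_b)
```

A smaller value means the cloud of points hugs the dotted line more tightly, which is what the eye picks up when comparing the two panels.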
This type of visualization is useful not only when you show a single comparison; it can also be a powerful way to communicate when you show multiple comparisons. For example, Figure 2 shows four comparisons between MLM and PMLM under different striping colors and pavement types. From this figure, it is obvious that PMLMs performed better than MLMs in most cases, indicating that the proposed method indeed provides a robust way to predict pavement marking retroreflectivity under different weather conditions, line colors, and pavement surface types.
Figure 2. Predicted versus Observed Retroreflectivity (Methyl Methacrylate)
To show how Plotly works, I recreated the two graphics in Figure 1 in Plotly. As shown in Figure 3, the two graphics are now fully interactive online graphics that can support more in-depth discussions. Here are some examples of the functions provided by Plotly: (1) If you move your mouse cursor onto any point in the graphic, a small textbox pops up and shows the values of that point. (2) If you drag to select a specific region, the graphic zooms in to that region. To zoom back out, simply double-click the graphic. (3) To view only certain datasets, click the legend entries you want to hide. For example, to compare only Pennsylvania results, you can click on FL and MN in both graphics to hide them. Click them again to show those points.
Figure 3. Recreating Figure 1 in Plotly
I was very excited to realize that there are so many tools that make communicating ideas and scientific analysis much easier. I have always been a big fan of ggplot2 and now I am also a big fan of Plotly. I plan to use Plotly more often, especially when I am developing ideas that I need to brainstorm with my advisor or colleagues.
For those interested in learning more about these two tools, I plan to write a short tutorial with detailed code to show you how to create these figures. So stay tuned!