Data Context
Fidgrove
At Fidgrove, we decided to have our platform support rFactor 2 for several reasons, including its incredible physics simulation and FFB system, as well as its openness to third-party development. It’s great to be able to get a lot of data and interact with it, especially for data geeks like ourselves. Naturally, this openness means that sometimes you might get something that’s not exactly what you’re expecting, even if that’s what it appears to be at first glance. To do a good job managing and analyzing this telemetry data, we need to make sure we understand the data we have.
We need context for the data we gather. Simple examples include identifying the car being run, the track layout, and even the rFactor 2 build. This type of information is very useful for labeling data, which makes it possible to connect different data pools. That in turn provides a ground data layer that’s ready for more sophisticated data analysis.
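To make the labeling idea concrete, here is a minimal Python sketch of how context could be attached to telemetry and used to pool it. This is not Fidgrove’s actual data model: the `SessionContext` class, its field names, and the `group_by_context` helper are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionContext:
    """Hypothetical context label attached to every piece of telemetry."""
    car: str           # identified car (assumed to be resolved already)
    track_layout: str  # track plus layout
    sim_build: str     # rFactor 2 build the data was recorded on


def group_by_context(laps):
    """Pool lap times by their context label.

    `laps` is an iterable of (SessionContext, lap_time_seconds) pairs.
    Laps that share the same car, layout, and build land in the same pool.
    """
    pools = defaultdict(list)
    for context, lap_time in laps:
        pools[context].append(lap_time)
    return dict(pools)
```

Because the label is a frozen dataclass, it is hashable and can be used directly as a dictionary key, so two sessions end up in the same pool only when every context field matches.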
Let us run through the thought process regarding vehicle data. When a user is out on track and we’re storing telemetry data, we need to identify the car associated with it. One would be led to think that it would be easy to identify a car, for example with a car name shared by the simulation software. Well, that’s not actually the case, as a few different (albeit similar) cars share the same global package name in rFactor 2 (e.g., USF 2000 National vs USF 2000 Championship cars, Radical SR3 left vs right-hand drive cars). Still, even if you overcome this challenge and identify the car, can you be sure that’s really the car you expect and that it has not been modified in some way? We can check the vehicle’s physics info to help make that decision. If the physics are as expected, it’s the same car, right? Well, software does evolve, and new versions might come with tweaked car physics. So, would this be a different car, or the same car? How should we aggregate this in our data platform so that we stay aware of the potential issue of comparing performance data from different car versions?
In Fidgrove’s platform, all telemetry data has metadata that answers this and other questions. In the specific case of car identification, we look at an array of variables in order to match a car. We also look at version numbers and physics information in order to define an “umbrella car”, which includes all versions of a specific car.
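The “umbrella car” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our production logic: the `CarFingerprint` fields (package name, variant, version, physics hash), the `UmbrellaCar` container, and the `register` and `comparable` helpers are hypothetical, and the real matching uses more variables than shown here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CarFingerprint:
    """Hypothetical set of variables used to match a car."""
    package_name: str  # global package name (may be shared by several cars)
    variant: str       # assumed extra field that separates cars in one package
    version: str       # package version reported for the session
    physics_hash: str  # assumed digest of the car's physics information

    @property
    def umbrella_key(self):
        # Version and physics are deliberately left out of the key, so
        # every version of the same car falls under one umbrella.
        return (self.package_name, self.variant)


@dataclass
class UmbrellaCar:
    key: tuple
    # Maps each seen version to the set of physics hashes observed for it.
    versions: dict = field(default_factory=dict)


def register(registry, fingerprint):
    """Record a fingerprint under its umbrella car, creating it if needed."""
    car = registry.setdefault(
        fingerprint.umbrella_key, UmbrellaCar(fingerprint.umbrella_key)
    )
    car.versions.setdefault(fingerprint.version, set()).add(
        fingerprint.physics_hash
    )
    return car


def comparable(a, b):
    """True only when both sessions ran the same car with the same physics."""
    return a.umbrella_key == b.umbrella_key and a.physics_hash == b.physics_hash
```

The design choice here is to keep two levels of identity: the umbrella key answers “is this the same car?”, while the physics hash answers “can performance data from these two sessions be compared directly?”.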
At the moment, we support rFactor 2 content developed by Studio 397, as well as Reiza content distributed as Studio 397 on Steam. That said, we designed this process so that it applies a few control rules dynamically, which means new cars that become available on Steam are automatically supported.
Throughout this work, we felt a list of cars and track layouts was missing, so we created one ourselves and made it available to the rFactor 2 community here! We’ve also included further information that was relevant to us when looking at the available content. For example, for cars, we show the car’s primary class, the date the package was last updated, content type (free or paid), and engine type (ICE or electric). In addition, we added a simple search and filter feature so that it’s easier to find what you are looking for.
In a nutshell, we defined and implemented a procedure to confidently create context metadata, identifying the cars, tracks and rF2 builds being used. All of this gives us certainty in the data computing we’re developing. We can’t wait to show you more.