Submission and summary of my approach for the Maven LEGO Challenge (updated 16/02/24: finalist)
“For the Maven LEGO Challenge, you’ll need to stack your imagination and analytical prowess to piece together an interactive dashboard or visual that lets users explore the history and evolution of LEGO sets from the past 5 decades.”
The dataset contains URLs to images. I like images of LEGO sets, especially the sets released in the early ’70s: the LEGO sets I dreamt of. I decided that I was the user, and that what I wanted most was to query the data, see a result grid, and hover over cards to see the details. And to find some of the LEGO I played with again.
The data (Power Query)
I cleaned the data and:
- grouped all the age fields into a custom “recommended age” field, based on the age groups LEGO currently uses.
- dropped the theme group field and merged theme and subtheme into one extra column. If querying is the goal and you also want to offer the LEGO set “name” for a LIKE-style text search, you have to decide what to leave out to avoid an endless row of filter fields.
- grouped all the “pieces” values into a custom field with ranges.
- created dimension tables for release period, recommended age, number of pieces, my combined theme and subtheme, and category.
- replaced empty values in pieces and age range with “Unknown”.
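The cleaning steps above were done in Power Query. As a rough illustration only, the same binning logic could be sketched in Python. The field names (`min_age`, `pieces`, `theme`, `subtheme`), the bin edges and the labels are all assumptions made for this example, not the values used in the report.

```python
from typing import Optional

# Hypothetical bin edges and labels; the real age groups and piece
# ranges used in the report may differ.
AGE_GROUPS = [(18, "18+"), (13, "13+"), (9, "9+"), (6, "6+"), (4, "4+"), (0, "1.5+")]
PIECE_RANGES = [(1, 99, "1-99"), (100, 499, "100-499"), (500, 999, "500-999")]


def recommended_age(min_age: Optional[float]) -> str:
    """Map a minimum age to an age-group label, or "Unknown" if missing."""
    if min_age is None:
        return "Unknown"
    for lower, label in AGE_GROUPS:  # ordered from highest to lowest bound
        if min_age >= lower:
            return label
    return "Unknown"


def piece_range(pieces: Optional[int]) -> str:
    """Map a piece count to a range label, or "Unknown" if missing or zero."""
    if pieces is None:
        return "Unknown"
    for low, high, label in PIECE_RANGES:
        if low <= pieces <= high:
            return label
    return "1000+" if pieces >= 1000 else "Unknown"


def theme_label(theme: str, subtheme: Optional[str]) -> str:
    """Merge theme and subtheme into one filterable value."""
    return f"{theme} / {subtheme}" if subtheme else theme


def clean_row(row: dict) -> dict:
    """Build the derived columns for one set (a row of the source data)."""
    return {
        "name": row["name"],
        "recommended_age": recommended_age(row.get("min_age")),
        "piece_range": piece_range(row.get("pieces")),
        "theme_subtheme": theme_label(row["theme"], row.get("subtheme")),
    }
```

Each derived column can then feed its own small dimension table, which keeps the number of filters on the report page manageable.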
Despite the data cleaning, strange results sometimes occur. There are books with more than one piece. The question is: was it a set with a book that ended up in the wrong category, OR was it a book with too many pieces recorded, OR something else?
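A quick way to list such cases for manual review is a simple filter. This is only a sketch: the field names `category` and `pieces` and the category value "Book" are assumptions about how the data is labelled.

```python
def suspicious_books(sets):
    """Return sets categorised as books that report more than one piece."""
    return [
        s for s in sets
        if s.get("category") == "Book" and (s.get("pieces") or 0) > 1
    ]


# Made-up sample rows, not taken from the real dataset.
sample = [
    {"name": "Idea Book", "category": "Book", "pieces": 1},
    {"name": "Book with bricks", "category": "Book", "pieces": 40},
    {"name": "Fire Station", "category": "Normal", "pieces": 300},
]
flagged = suspicious_books(sample)
```

The filter does not decide which explanation is right; it only narrows down the rows worth looking at.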
- The image in the upper left corner was generated with a GPT Tattoo creator. DALL-E can generate nice images, but the spelling, oh my.
- The search filters are placed horizontally. I actually liked a vertical layout more, because the grid would be partially visible, but horizontal is easier when you wonder what you selected.
- Top right: an information button that gives access to some search tips, and a clear-filter button.
- The release visual is actually two visuals. One shows the number of releases per year (white line). The same visual, made transparent and fixed on top of it (I hope, I tried), shows the number of results found for the query (yellow). I added this visual because for many selections it gives a quick impression of proportion, e.g. the share of book releases in the total, or the releases specifically for 18+ compared to the total.
- Some numbers about the search results are shown next to the release visual. You cannot leave it all out, or can you? Most of them are text boxes with inserted measure results.
- The grid with cards is a custom visual, Multi Info Cards.
- A hidden page has the details about a card and acts as a tooltip.
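The two stacked lines of the release visual described above come down to two counts per year: all releases, and the releases that match the current query. A minimal sketch of that calculation, assuming each set is a record with a `year` field and that the query is a simple predicate (both are assumptions for illustration; in the report this is done with visuals and measures):

```python
from collections import Counter


def releases_per_year(sets):
    """Count the number of sets released in each year."""
    return Counter(s["year"] for s in sets)


def found_vs_total(all_sets, predicate):
    """Return {year: (matching releases, total releases)} for every year."""
    total = releases_per_year(all_sets)
    found = releases_per_year(s for s in all_sets if predicate(s))
    return {year: (found.get(year, 0), total[year]) for year in sorted(total)}


# Made-up sample rows, not taken from the real dataset.
sample = [
    {"year": 1972, "category": "Normal"},
    {"year": 1972, "category": "Book"},
    {"year": 1973, "category": "Normal"},
]
result = found_vs_total(sample, lambda s: s["category"] == "Book")
```

Keeping the total independent of the filter is what makes the proportion visible: the white line stays put while the yellow line changes with the selection.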
I left out minifigures because I saw too many LEGO sets in the data whose minifigure count contradicted the image shown. And I never liked minifigures. I left out the prices because they are US prices at release, and I had the feeling you would easily slip into “comparing apples to oranges” mode.
Data cleaning and modeling