
Pricing Fortnite Accounts Directly From Locker Screenshots: A First Experiment

PriceMyGame Team · 7 min read

Every Fortnite seller proves their locker the same way: with screenshots. A grid of skins, a screenshot of the in-game locker, sometimes a whole gallery the buyer is expected to scroll through. Our pricing model has been ignoring all of it — it works from the seller's typed description.

This post is our first experiment at fixing that. We trained a small model that reads the screenshots directly, identifies the cosmetics in them, and learns what each item is worth. The numbers below are early. The dataset is small. We're publishing them because the rankings are already interesting, and because the path forward is clear.

Treat this as a research log, not a launched feature.

Why screenshots matter more than descriptions

We already showed in the locker-proof post that around 1 in 3 listings link to an external image host, and bigger lockers tend to come with image proof. What that post didn't dig into: among listings that do describe their inventory in text, only a fraction list outfit names cleanly enough that we can match them against a cosmetic database.

Across the early pilot batch we OCR'd, only about 1 in 5 listings had a description that yielded a parseable list of skin names. For the other 4 in 5, the screenshots are the only objective record of what's actually in the account.

In other words: image-based pricing isn't a nice-to-have. For most listings, it's the only way to score the inventory at all. That's what motivated this experiment.

Reading the locker — the OCR engine we picked first

For the first experiment, we wanted something we could realistically run in production without standing up a GPU. We tried PaddleOCR's PP-OCRv5 — a CPU-only pipeline that fits in roughly 150 MB on disk and runs on a regular VPS. Cosmetic names are short, in a fixed font, and laid out on a grid. That's exactly the kind of input where a small dedicated OCR model can do well.

• OCR engine: PaddleOCR PP-OCRv5 (CPU-only, ~150 MB on disk)
• Overall F1: 82% on hand-labeled locker tiles
• Description vs. OCR: 79% median item overlap (sanity check)

The naive "feed the whole screenshot to the OCR" approach didn't work. PaddleOCR's text detector internally rescales images to a fixed shape, so a wide locker grid loses small text along the way. What worked, after benchmarking eight preprocessing variants on a hand-labeled gold set, was tiling each screenshot into a 2×2 grid with a small overlap and bicubically upscaling each piece by 2× before OCR. That preserved aspect ratios, cut down on adjacent-label merging in dense rows, and recovered around twelve percentage points of accuracy compared to the baseline.
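To make that recipe concrete, here's a minimal sketch of the preprocessing step in Python with Pillow. The 2×2 grid, 10% overlap, and 2× bicubic upscale mirror what's described above; the function names and the commented PaddleOCR hookup are illustrative assumptions, and PaddleOCR's call signature varies by version.

```python
# Minimal sketch of the tiling preprocessing described in the text.
from PIL import Image

def locker_tiles(path, overlap=0.10, upscale=2):
    """Yield 2x2 overlapping tiles of a locker screenshot, upscaled 2x."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    ox, oy = int(w * overlap / 2), int(h * overlap / 2)
    for row in range(2):
        for col in range(2):
            # Pad each tile by half the overlap on every interior edge,
            # clamped to the image bounds, so neighbors share a strip.
            box = (max(0, col * w // 2 - ox),
                   max(0, row * h // 2 - oy),
                   min(w, (col + 1) * w // 2 + ox),
                   min(h, (row + 1) * h // 2 + oy))
            tile = img.crop(box)
            # Bicubic 2x upscale preserves the aspect ratio and lifts
            # small label text above the detector's effective minimum.
            yield tile.resize((tile.width * upscale, tile.height * upscale),
                              Image.BICUBIC)

# Hypothetical hookup (PaddleOCR v2.x-style API; newer releases differ):
#   from paddleocr import PaddleOCR; import numpy as np
#   ocr = PaddleOCR(lang="en")
#   for tile in locker_tiles("locker.png"):
#       results = ocr.ocr(np.array(tile))
```

One detail the sketch skips is deduplicating detections that land in the shared overlap strips; the fuzzy-matching step below absorbs most of those duplicates anyway.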

OCR Accuracy by Cosmetic Category
F1 score on a hand-labeled gold set — higher is better
• Emotes: 91%
• Skins: 82%
• Gliders: 81%
• Pickaxes: 72%

Emotes are the easiest — long names survive partial mis-recognition because fuzzy matching can still reach the right entry. Pickaxes are the hardest: small fonts on dual-blade icons routinely fool the detector. Skins and gliders sit comfortably in between.
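To illustrate why name length matters, here's a tiny fuzzy-lookup sketch using Python's standard-library difflib. The database snippet, the 0.75 cutoff, and the helper name are assumptions for illustration, not our production matcher.

```python
# Toy fuzzy lookup of noisy OCR text against a cosmetic database,
# using only the standard library.
import difflib

COSMETIC_DB = ["Sparkle Specialist", "Skull Trooper", "Billy Bounce",
               "Renegade", "Renegade Raider", "Blue Squire"]

def match_cosmetic(ocr_text, cutoff=0.75):
    """Return the closest database entry for an OCR'd label, or None."""
    hits = difflib.get_close_matches(ocr_text.title(), COSMETIC_DB,
                                     n=1, cutoff=cutoff)
    return hits[0] if hits else None

# A long emote name survives a mangled character:
match_cosmetic("SPARKLE SPECIALI5T")  # -> "Sparkle Specialist"
# A short name matches cleanly but can't separate variants, which is
# the Renegade problem discussed later in the post:
match_cosmetic("RENEGADE")            # -> "Renegade"
```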

The F1 numbers above are scored against a small set of locker tiles we hand-labeled — every item in every image, by category. That's the only honest way to measure OCR on a long-tail vocabulary, and it's the bottleneck that will determine how confident the eventual pricing tool can be.

Cross-signal sanity check

For the listings where we have both a parseable description and an OCR-derived item list, the two sets of skins agree on 79% of items at the median. That's the smell test that tells us OCR is reading the locker correctly — not just hallucinating names that look like skins, but actually identifying the same cosmetics the seller wrote down. Encouraging enough to justify the next step of the experiment.
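A small sketch of that check, assuming agreement is defined as intersection over the smaller set (the post doesn't pin down the exact formula, so that definition is our assumption here):

```python
# Cross-signal sanity check: per-listing agreement between the
# description-derived and OCR-derived item sets, summarized by median.
import statistics

def item_overlap(desc_items, ocr_items):
    a, b = set(desc_items), set(ocr_items)
    if not a or not b:
        return 0.0
    # Intersection over the smaller set (an assumed definition).
    return len(a & b) / min(len(a), len(b))

def median_overlap(pairs):
    """pairs: iterable of (desc_items, ocr_items) per listing."""
    return statistics.median(item_overlap(d, o) for d, o in pairs)
```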

It's also a hint at what the eventual production model should look like: descriptions and OCR each catch a few items the other misses, so combining the two signals is going to beat either one alone.

From items to price — a deliberately simple first model

For the model itself, we kept things blunt on purpose: one binary feature per unique cosmetic name (across skins, pickaxes, gliders, and emotes), and a Lasso regression on the log of the listing price. Lasso shrinks irrelevant features to zero automatically, which is exactly what we need at this stage — the dataset is small, the feature space is huge (close to 1,900 cosmetics appearing in at least three lockers), and we don't yet know which items carry signal and which are noise.
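In code, the whole model is a few lines. This sketch uses scikit-learn's LassoCV; the DataFrame encoding and the cross-validated alpha are illustrative assumptions about details not spelled out above.

```python
# Sketch of the model described above: one binary feature per cosmetic,
# Lasso regression on the log of the listing price.
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV

def fit_locker_model(lockers, prices):
    """lockers: list of item-name lists; prices: listing prices in USD."""
    # One-hot encode: 1 if the cosmetic appears in the locker, else 0.
    X = (pd.DataFrame([{item: 1 for item in locker} for locker in lockers])
           .fillna(0).astype(int))
    y = np.log(np.asarray(prices, dtype=float))
    # LassoCV picks the regularization strength by cross-validation and
    # shrinks uninformative item coefficients exactly to zero.
    model = LassoCV(cv=5).fit(X, y)
    coefs = pd.Series(model.coef_, index=X.columns)
    return model, coefs[coefs != 0].sort_values(ascending=False)
```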

Out of those ~1,900 candidate items, the model kept 47 with non-zero weight. Here's what they look like.

Items the First Model Most Associates With Higher Prices
Estimated price effect when the item shows up in a locker — early experiment, small sample
• Blue Squire: +45%
• Flawless: +32%
• Victor's Flail: +27%
• Onslaught: +25%
• Sparkle Specialist: +24%
• Axecaliber: +22%
• Skull Trooper: +20%
• Billy Bounce: +19%
• Tech Axe: +18%
• Lil' Saucer: +17%
• Royale X: +15%
• Havoc: +13%
• Master Key: +12%

The top of that list reads like an OG-Chapter-1 greatest hits. Blue Squire and Sparkle Specialist are Season 2 Battle Pass skins, Skull Trooper is the Halloween 2017 Ghoul Trooper companion, Havoc is the Twitch Prime exclusive that ended in 2018, and Flawless is from the Save the World Founder's Pack era. The model discovered all of this on its own — we didn't tell it which items were OG, only which lockers had which items.

The pickaxes it surfaced (Victor's Flail, Onslaught, Axecaliber, Tech Axe) are Chapter 1 reward axes that show up disproportionately in higher-value lockers. The emotes (Billy Bounce, Lil' Saucer) are similar — Season 2 and Season 3 originals that ride along with OG accounts.
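A note on how coefficients become the percentages shown above: on a log-price target, a coefficient β on a binary item feature corresponds to a multiplicative effect of exp(β) − 1 when the item is present. This assumes natural logs, as in the sketch earlier; the beta below is hypothetical.

```python
# Turning a log-price coefficient into a "% effect when present".
import math

beta = 0.372                  # hypothetical coefficient for one item
effect = math.exp(beta) - 1   # ~0.45, i.e. about +45% when present
```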

Items That Drag the Estimated Price Down
Common cosmetics the first model treats as low-value markers
• Hope: -11%
• Nanner Bashers: -9%
• Renegade: -8%
• Make a Seat: -8%
• Sledgecracker: -7%
• Destroyer's D.: -7%
• Bushranger: -6%
• Crowning Achievement: -5%
• Default Pickaxe: -5%

The negative side is just as readable: Default Pickaxe dragging the price down is exactly what you'd expect (it's the placeholder for accounts that haven't equipped anything else), and Hope, Bushranger, and Crowning Achievement are common cosmetics that cluster on lower-priced accounts.

One curiosity worth flagging: Renegade comes out as a negative. At face value that's wrong — Renegade Raider is one of the rarest skins in the game. But the OCR reads "RENEGADE" the same way regardless of whether it's the rare Renegade Raider or the much more common free Renegade. This is exactly the kind of real-world wrinkle a first experiment is designed to surface, and it points at one of the next steps: marrying OCR with the cosmetic-database lookup more carefully so we can tell variants apart.

What this first experiment showed

The model fits well in-sample — it explains roughly two-thirds of the variation in log-prices on the data it was trained on, with an average error of about $49 against a median listing price of $100. On held-out folds, that performance falls apart: out-of-fold R² hovers around zero. At this scale, the rankings are real but the dollar predictions don't generalize yet.

• In-sample R²: 0.66 (signal is real on the training set)
• Out-of-fold R²: ≈ 0 (too few accounts to generalize yet)
• Items auto-selected: 47 out of ~1,900 candidates

That gap between in-sample fit and out-of-fold generalization is the single most important number on this page. It's not a sign the approach is wrong — it's a sign the dataset is small. With more screenshots in training, the model has more chances to see each individual item across many different account contexts, and the noise on each item's coefficient shrinks accordingly. The bottleneck is screenshot coverage, not the modeling choice.
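The out-of-fold number comes from scoring the same pipeline on held-out folds instead of the training data. A minimal version with scikit-learn, assuming the binary X / log-price y from the earlier sketch; the fixed alpha and 5-fold split are illustrative:

```python
# Out-of-fold check: R^2 measured on listings each fold's model
# never saw during training.
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

def out_of_fold_r2(X, y_log, alpha=0.01):
    scores = cross_val_score(Lasso(alpha=alpha), X, y_log,
                             cv=5, scoring="r2")
    # Near zero at the current sample size; should climb with data.
    return scores.mean()
```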

So: treat these item rankings as observations from a first experiment, not as production prices. The list above is interesting precisely because it lines up with what the marketplace already knows — Chapter 1 skins, OG pickaxes, Season 2 emotes. The model rediscovered "OG = expensive" purely from looking at lockers, which is the result we wanted from this step.

What's next

The clearest next experiment is just more data. We're scaling up the screenshot collection so the same training pipeline runs on multiples of the current pilot. Past a certain point, the out-of-fold numbers should start to look much more like the in-sample fit.

Two other directions we want to try:

  • Combine signals. Use OCR for the bulk of the inventory and the description text as a tie-breaker, especially for items like Renegade vs. Renegade Raider where the icon disambiguates what the text can't (a minimal merge sketch follows this list).
  • Plug into the existing model. Our current pricing model and the per-skin marginal-value model already work well from typed input. Adding OCR-derived locker contents as new features should improve both, especially for the listings where the description leaves the inventory blank.
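As a shape for that first direction: union the two item sets while tracking where each item was seen, so single-source or text-ambiguous names can be routed to a second look. The dict layout and the AMBIGUOUS set are illustrative assumptions.

```python
# Minimal sketch of merging OCR-derived and description-derived
# inventories with provenance tracking.
AMBIGUOUS = {"Renegade", "Renegade Raider"}  # text alone can't settle these

def merge_inventory(ocr_items, desc_items):
    merged = {}
    for source, items in (("ocr", ocr_items), ("description", desc_items)):
        for item in items:
            merged.setdefault(item, set()).add(source)
    # Items only one signal saw, or known text-ambiguous names, get a
    # second pass (e.g. icon matching) before feeding the price model.
    needs_review = {item for item, sources in merged.items()
                    if len(sources) < 2 or item in AMBIGUOUS}
    return merged, needs_review
```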

If the rest of the experiments hold up, this will all eventually ship as a button on the calculator: drop a locker screenshot, get a price. We're not there yet — this is just the first step.

In the meantime, the calculator already prices accounts from typed inventories, and we'll keep writing about each step of the image-pricing build the same way we did the first and second models. The next post in this series will be the bigger-data follow-up.