How we read the sources
We are not a lab and we are not a listicle. Our one job is to read widely enough to know where the genuine agreement on an app sits, and to report that honestly — including when the answer is "it depends" or "the room is split."
The communities we read
For calorie and nutrition apps — our core beat — we read the long-running recommendation and experience threads in r/loseit, r/CICO, r/nutrition, and the app-specific communities such as r/MacroFactor. For other categories we go to the communities most qualified to answer — the dedicated subreddit plus one or two broader rooms where the app is discussed by people without a stake in defending it. We read the high-engagement threads, top-sorted, because heavily-upvoted discussion reflects what a community endorsed, not just what one person posted.
The source types we weigh
- Long-term users. The post that matters most is from the person still logging a year later, not the one reacting on day one. Staying power is the signal we trust most.
- Coaches and practitioners. People who watch many clients succeed or quit see patterns a single user can't. We weight their observations about adherence heavily.
- Hands-on use. We install the app, set it up cold like a new user, and live in it for a few weeks before writing — onboarding friction, logging speed, where the database is thin, which paywall nags show up.
- Independent reviewers. Other careful write-ups, read for where they agree and where they don't.
- Official specs. Pricing, platforms and feature facts come from the source, not from a secondhand summary.
- Research. Where a claim turns on it — for example, the self-monitoring and adherence literature — we read the study itself and cite it, rather than leaning on what a headline said it found.
Our time window
Software moves, so recency matters. We prioritise discussion and testing from roughly the last twelve to eighteen months, and we note when a once-common complaint has been fixed or a loved feature removed. Every piece carries a visible "last reviewed" date and, where the picture has shifted, a short changelog. Our verdicts are stated as consensus as of May 2026, because that's the honest shelf life of this kind of read.
How we resolve conflicting evidence
When sources disagree, we don't average them into mush or pick the tidier story. We ask which source is best placed to know: for adherence, long-term users and coaches outrank a feature spec; for micronutrient accuracy, the data and official documentation outrank vibes. When hands-on experience contradicts community sentiment, we say both and explain the gap. A vocal minority, a brigaded thread, or a vendor's own community can skew a read — we flag those and discount them. And when the disagreement is real and unresolved, we publish it as a split rather than inventing a winner.
What we count as consensus
We use plain-language confidence bands instead of scores, because a number on a moving target reads as objectivity we don't actually have:
- Strong consensus — broad, durable agreement across communities, long-term users and reviewers.
- Mixed consensus — a rough majority view, but credible people land elsewhere for specific reasons we name.
- Genuinely divisive — the room is split; we map the disagreement instead of resolving it.
- A default that stuck — not the newest or loudest, but the option a lot of people switched to over the past year-plus and then kept using.
- Emerging, worth watching — promising and gaining ground, but with a track record too short to crown.
Calling something a default that stuck is a claim about retention over time, not about being the best at everything — and we keep it that narrow on purpose.
Our limitations
A subreddit is not a representative sample of the world; the people who post are more engaged, more opinionated, and skew toward certain demographics. We say so, and we don't treat a loud thread as a referendum. We are not running controlled accuracy tests, so where a question turns on a hard measurement we report what users and official sources say rather than overstating what a forum can establish. We paraphrase recurring sentiment and link the originals so you can check our characterisation; we never invent quotes, usernames or upvote counts. If we're not certain of an exact wording, we paraphrase rather than risk a misquote.
Independence
Our verdicts aren't for sale, and the full disclosure of how the desk is funded lives in one place so we don't perform it on every page: see About the desk. The short version is that app links go to official pages so you can verify facts, and none of them pay us.
Think we misread the room on something? Write to desk@thetestdesk.com with the link and we'll take another look.