Independent · No affiliate links · No sponsors
The Test Desk Consensus reviews · apps

How we read the sources

We are not a lab and we are not a listicle. Our one job is to read widely enough to know where the genuine agreement on an app sits, and to report that honestly — including when the answer is "it depends" or "the room is split."

The communities we read

For calorie and nutrition apps, our core beat, we read the long-running recommendation and experience threads in r/loseit, r/CICO, r/nutrition, and app-specific communities such as r/MacroFactor. For other categories we go to the communities most qualified to answer: the dedicated subreddit plus one or two broader rooms where the app is discussed by people without a stake in defending it. We read the high-engagement threads, sorted by top, because heavily upvoted discussion reflects what a community endorsed, not just what one person posted.

The source types we weigh

Our time window

Software moves, so recency matters. We prioritise discussion and testing from roughly the last twelve to eighteen months, and we note when a once-common complaint has been fixed or a loved feature removed. Every piece carries a visible "last reviewed" date and, where the picture has shifted, a short changelog. We state our verdicts as the consensus as of May 2026, because a read like this has a shelf life and the honest thing is to date it.

How we resolve conflicting evidence

When sources disagree, we don't average them into mush or pick the tidier story. We ask which source is best placed to know: for adherence, long-term users and coaches outrank a feature spec; for micronutrient accuracy, the data and official documentation outrank vibes. When hands-on experience contradicts community sentiment, we report both and explain the gap. A vocal minority, a brigaded thread, or a vendor's own community can skew a read; we flag those and discount them. And when the disagreement is real and unresolved, we publish it as a split rather than inventing a winner.

What we count as consensus

We use plain-language confidence bands instead of scores, because a number on a moving target reads as objectivity we don't actually have.

Calling something a default that stuck is a claim about retention over time, not about being the best at everything — and we keep it that narrow on purpose.

Our limitations

A subreddit is not a representative sample of the world; the people who post are more engaged, more opinionated, and skew toward certain demographics. We say so, and we don't treat a loud thread as a referendum. We are not running controlled accuracy tests, so where a question turns on a hard measurement we report what users and official sources say rather than overstating what a forum can establish. We paraphrase recurring sentiment and link the originals so you can check our characterisation; we never invent quotes, usernames or upvote counts. If we're not certain of an exact wording, we paraphrase rather than risk a misquote.

Independence

Our verdicts aren't for sale, and the full disclosure of how the desk is funded lives in one place so we don't perform it on every page: see About the desk. The short version is that app links go to official pages so you can verify facts, and none of them pay us.

Think we misread the room on something? Write to desk@thetestdesk.com with the link and we'll take another look.