H — 001 · Hackathon
— Open Call —
Active · May mmxxvi

Coherence Filter.

An open hackathon for the defense of open communities. Compute for whoever delivers the framework that lets them hold.

Let’s see how open source can truly defend itself.

Open communities are increasingly attacked. The cost of producing plausible automated participation has fallen below the cost of producing a thoughtful human one. Communities that depend on shared attention — for governance, for research, for the slow work of building things in public — are losing the signal of who is actually present. This hackathon is a serious attempt to give them back the means to know.

What follows is the brief: the threat model, a reference architecture that submissions can engage with or replace, a formalized evaluation rubric, the submission shape, the prize structure, and the window. Read it carefully if you intend to enter. Read it carefully also if you intend to argue that the entire premise is wrong. Either is welcome.

§ I

The Threat

Open communities accumulate four distinct pressures, each leaving different traces. A defense framework that catches one cleanly will miss the others. The hardest case is not the obvious bot.

Transactional spam.

Phishing links, fake-giveaway DMs, scam impersonations of moderators or administrators. Visible payload makes these the most documented and the easiest to detect; existing tooling catches most of them. They are mentioned here for completeness, not because they are interesting.

Account farming.

Accounts age in the background — joining hundreds of communities, idling for weeks, then activating in coordinated waves for spam, vote brigading, or sale. The signal lives in the temporal pattern: server-join velocity, time-to-first-message distribution, correlated activation events across cohorts that arrived together.

Engagement automation.

Language models have made it cheap to produce plausible-but-shallow conversation. An engagement bot can pass a five-message Turing test but fails sustained reciprocity: it does not form dyads, does not remember exchanges across days, does not escalate or de-escalate emotional register in response to context. The signal lives in the interaction graph and in the temporal coherence of stated commitments.

Extractive presence.

Real humans who join solely to scrape conversations for training data, harvest contact lists for cold outreach, or recruit members away. No spam, no rule violation, no contribution. The signal lives in the asymmetry: high read activity, low post activity, surgical engagement only with high-status members.

A community filled with the third pressure, engagement automation, is statistically alive and substantively dead. That is the failure mode this framework must name and prevent.

§ II

A Reference Architecture

A defense framework should compose three layers, each independently inspectable. This is a reference. A submission may replace any layer with a stronger alternative. A submission may also reject the decomposition and propose a different one, provided the alternative satisfies the architectural commitments below.

The Signal Layer

Raw features extracted from the available record. None of these require user-private content; all can be derived from public message history and account metadata.

Account metadata.

Account age, prior server count, name and avatar entropy, badge and role presence, verification state. Useful as priors but never decisive in isolation — legitimate new members exist.

Join behavior.

Time-to-first-message after joining. Time-of-day distribution of activity. Server-join velocity (this account joined N servers in the last 24 hours). Correlated arrival cohorts.
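A minimal sketch of two of these join-behavior features, assuming join and message times are available as `datetime` values. The function names and the 24-hour default window are illustrative, not part of the brief.

```python
from datetime import datetime, timedelta


def join_velocity(join_times, now, window=timedelta(hours=24)):
    """Count server joins that fall inside the trailing window ending at `now`."""
    return sum(1 for t in join_times if now - window <= t <= now)


def time_to_first_message(joined_at, message_times):
    """Delay between joining and the first message sent afterwards, or None."""
    later = [t for t in message_times if t >= joined_at]
    return min(later) - joined_at if later else None
```

Neither value is decisive alone; they are inputs for the decision layer to weigh alongside the other signals.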

Linguistic signal.

Embedding distance between an account’s posts and the community’s baseline distribution. Repetition rates across servers (the same phrasing posted in multiple unrelated communities). Template detection. Response coherence under sustained back-and-forth.
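The cross-server repetition rate can be sketched without any model at all. The version below assumes posts arrive as `(server_id, text)` pairs and counts only exact matches after light normalization; the names are hypothetical, and a real submission would likely want near-duplicate matching as well.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial edits do not hide a repeat."""
    return re.sub(r"\s+", " ", text.strip().lower())


def cross_server_repetition_rate(posts):
    """Fraction of posts whose normalized text also appears in another server.

    posts: iterable of (server_id, text) pairs for a single account.
    """
    posts = list(posts)
    if not posts:
        return 0.0
    servers_by_text = defaultdict(set)
    for server_id, text in posts:
        servers_by_text[normalize(text)].add(server_id)
    repeated = sum(
        1 for _, text in posts if len(servers_by_text[normalize(text)]) > 1
    )
    return repeated / len(posts)
```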

Interaction graph.

Reply patterns. Reciprocity rates. Dyad and triad formation. Ratio of broadcast posts to addressed messages. Whether sustained mutual conversation forms across multiple sessions.
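One possible way to compute reciprocity and dyad formation from reply edges is sketched here. It assumes the record can be reduced to `(author, replied_to)` pairs; the `min_each_way` threshold is an illustrative choice, not something the brief prescribes.

```python
from collections import Counter


def reciprocal_dyads(replies, min_each_way=2):
    """Pairs of accounts that each replied to the other at least min_each_way times."""
    counts = Counter((a, b) for a, b in replies if a != b)
    dyads = set()
    for (a, b), n in counts.items():
        if n >= min_each_way and counts.get((b, a), 0) >= min_each_way:
            dyads.add(frozenset((a, b)))
    return dyads


def reciprocity_rate(replies, account):
    """Share of the accounts this account replied to that also replied back."""
    edges = {(a, b) for a, b in replies if a != b}
    targets = {b for a, b in edges if a == account}
    if not targets:
        return 0.0
    returned = sum(1 for b in targets if (b, account) in edges)
    return returned / len(targets)
```

Session boundaries are ignored in this sketch; checking that a dyad persists across multiple sessions would need timestamps on each edge.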

Temporal coherence.

Burst patterns versus continuous activity. Correlation of activity spikes with public events, announcements, or moderator actions (legitimate) versus with cohort triggers (suspicious).
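A rough proxy for burstiness is the coefficient of variation of the gaps between messages. The sketch below assumes timestamps in seconds and is only one of several reasonable statistics; it does not attempt the correlation with public events or cohort triggers described above.

```python
from statistics import mean, pstdev


def interval_cv(timestamps):
    """Coefficient of variation of gaps between consecutive timestamps.

    Values near 0 indicate metronomic posting; large values indicate bursts.
    Returns None when there are fewer than two gaps to compare.
    """
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    if avg == 0:
        return 0.0
    return pstdev(gaps) / avg
```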

Cross-community signal.

The same account behaving identically in N other shared communities, assessed via privacy-preserving message fingerprints rather than raw content. Submissions that handle this signal well are particularly valuable.
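The brief does not specify a fingerprinting scheme. As one hedged illustration, participating communities could agree on a secret key and exchange keyed hashes of normalized messages, so that raw content never leaves its home server. The function names and the shared-key arrangement are assumptions; this sketch matches only exact repeats, and near-duplicate detection would need a different technique.

```python
import hashlib
import hmac
import re


def fingerprint(text, shared_key):
    """Keyed SHA-256 digest of the normalized text; shared_key must be bytes."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hmac.new(shared_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def shared_fingerprints(local_texts, remote_fingerprints, shared_key):
    """Fingerprints of local messages that another community also reported."""
    local = {fingerprint(t, shared_key) for t in local_texts}
    return local & set(remote_fingerprints)
```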

The Decision Layer

Classifications produced from the signal layer. The decision layer must produce structured output a moderator can read, not opaque scores.

Per-message.

Classification across (clean, spam, scam, uncertain). Used for immediate actions on visible payload.

Per-account.

A probability vector across (genuine, automated, scammer, extractive, adversarial), with confidence. Used for moderation review and for the population coherence metric in § III.
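The per-account output could be represented as a small validated record. This is a sketch under the assumption that probabilities sum to one and confidence lies in [0, 1]; the class and field names are hypothetical.

```python
from dataclasses import dataclass

CATEGORIES = ("genuine", "automated", "scammer", "extractive", "adversarial")


@dataclass(frozen=True)
class AccountAssessment:
    account_id: str
    probabilities: dict  # category name -> probability
    confidence: float

    def __post_init__(self):
        if set(self.probabilities) != set(CATEGORIES):
            raise ValueError("probabilities must cover exactly the five categories")
        if abs(sum(self.probabilities.values()) - 1.0) > 1e-6:
            raise ValueError("probabilities must sum to 1")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")

    def top_category(self):
        """Most probable category; the full vector is still what moderators see."""
        return max(self.probabilities, key=self.probabilities.get)
```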

Per-cohort.

Identification of accounts that arrived together and have correlated activation patterns. Used for catching account farms before individual accounts cross any threshold on their own.
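The arrival half of cohort detection can be sketched as a simple gap-based grouping over join times. Timestamps are assumed to be in seconds, and the gap and size thresholds are illustrative defaults. Correlating later activation within each cohort is left out of this sketch.

```python
def arrival_cohorts(joins, max_gap_seconds=600, min_size=3):
    """Group accounts whose consecutive join times are close together.

    joins: iterable of (account_id, join_timestamp_seconds).
    Returns a list of cohorts, each a list of account ids in join order.
    """
    ordered = sorted(joins, key=lambda j: j[1])
    cohorts, current = [], []
    for account_id, ts in ordered:
        # A gap larger than the threshold closes the current group.
        if current and ts - current[-1][1] > max_gap_seconds:
            if len(current) >= min_size:
                cohorts.append([a for a, _ in current])
            current = []
        current.append((account_id, ts))
    if len(current) >= min_size:
        cohorts.append([a for a, _ in current])
    return cohorts
```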

Mod-in-the-loop.

The decision layer never auto-bans. It surfaces flagged accounts to moderators with confidence and reasoning, ranked by suspicion. The moderator decides. Reversibility is a hard requirement.

The Commitment Layer

Architectural commitments the framework must satisfy. These are not features; they are conditions on what counts as a defense.

Inspectability.

Every decision must produce a reason a human moderator can read in plain language. “Account X flagged as automated” is insufficient. “Account X posts at uniform 47-second intervals during business hours, has zero reciprocal exchanges in 90 days, and its message-embedding distribution is two standard deviations from the community baseline” is sufficient.
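One way to enforce this commitment in code is to make it impossible to emit a flag without evidence. The helper below is a hypothetical sketch: it assumes each signal has already been turned into a plain-language observation upstream.

```python
def explain_flag(account_id, label, evidence):
    """Render a flag as a sentence; refuses to produce a bare label.

    evidence: list of plain-language observations, one per contributing signal.
    """
    if not evidence:
        raise ValueError("a flag without evidence is not inspectable")
    if len(evidence) == 1:
        body = evidence[0]
    elif len(evidence) == 2:
        body = " and ".join(evidence)
    else:
        body = ", ".join(evidence[:-1]) + ", and " + evidence[-1]
    return f"Account {account_id} flagged as {label}: {body}."
```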

Tunability.

Thresholds, category weights, and signal layer composition must be configurable per-community. A research Discord and a gaming Discord have different baselines for what coherence looks like.
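Per-community tuning might look like a small configuration object that each deployment overrides. The field names, default values, and the weighting rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CommunityConfig:
    flag_threshold: float = 0.8
    category_weights: dict = field(default_factory=dict)
    enabled_signals: tuple = ("metadata", "join", "linguistic", "graph", "temporal")

    def should_flag(self, category, probability):
        """Surface the account for review if the weighted probability clears the bar."""
        weight = self.category_weights.get(category, 1.0)
        return probability * weight >= self.flag_threshold


# Hypothetical settings: a research server tolerates more, a gaming server less.
research = CommunityConfig(flag_threshold=0.9)
gaming = CommunityConfig(flag_threshold=0.6, category_weights={"automated": 1.2})
```

With these example settings, an account at 0.7 probability of being automated would be surfaced in the gaming server and left alone in the research server.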

Reversibility.

Every action is logged. Every category labeling is editable. Every decision is undoable. A framework that produces unrecoverable mistakes has not understood the problem.
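A minimal sketch of what reversibility could mean mechanically: an append-only log in which an undo is itself a logged action. The class and its in-memory storage are hypothetical; a real deployment would persist the entries.

```python
class ActionLog:
    def __init__(self):
        self.entries = []  # append-only history; nothing is ever deleted
        self.labels = {}   # current label per account

    def set_label(self, account_id, label, actor):
        """Record a label change; label=None clears the label."""
        previous = self.labels.get(account_id)
        self.entries.append(
            {"account": account_id, "from": previous, "to": label, "actor": actor}
        )
        if label is None:
            self.labels.pop(account_id, None)
        else:
            self.labels[account_id] = label

    def undo_last(self, account_id, actor):
        """Restore the label that preceded the most recent change, as a new entry."""
        for entry in reversed(self.entries):
            if entry["account"] == account_id:
                self.set_label(account_id, entry["from"], actor)
                return True
        return False
```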

Extensibility.

New attack patterns will emerge. The framework must accept new signal types and new attack categories without rebuilding from scratch.

Openness.

Weights, code, and methodology must be releasable. If this is open source defending itself, the defense must be open. Submissions that require proprietary models or closed APIs will be evaluated, but at a substantial discount.

These commitments are not aesthetic preferences. They are the conditions under which the framework can be trusted by the communities that deploy it.

§ III

The Rubric

Evaluation is not a single number. It is the joint distribution over three quantities, and submissions are compared by Pareto-improvement rather than by scalar score.
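Pareto comparison can be sketched as a dominance check. The example assumes each submission is summarized as a triple of (accuracy, coherence, false-positive harm), where the first two are better when higher and the third when lower. Collapsing precision and recall into one accuracy figure is a simplification made only for this illustration.

```python
def dominates(a, b, higher_is_better=(True, True, False)):
    """True if a is at least as good as b on every axis and strictly better on one."""
    strictly_better = False
    for x, y, up in zip(a, b, higher_is_better):
        if not up:
            x, y = -x, -y  # flip so that larger always means better
        if x < y:
            return False
        if x > y:
            strictly_better = True
    return strictly_better


def pareto_front(scores):
    """Names of submissions that no other submission dominates."""
    return {
        name
        for name, s in scores.items()
        if not any(dominates(o, s) for other, o in scores.items() if other != name)
    }
```

A submission on the front is not "the winner"; it is one that no other entry beats on all three quantities at once.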

Direct accuracy.

Precision and recall against labeled bot, scammer, and extractive accounts in held-out segments of the available record. Both matter; neither alone is sufficient. A framework with 90% precision and 30% recall is not better than one at 75/75.
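To make the 90/30 versus 75/75 comparison concrete, the sketch below computes precision and recall from counts and uses the F1 score (the harmonic mean of the two) purely as an illustration. The rubric itself does not name F1 and does not reduce accuracy to a scalar.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall; penalizes a large gap between them."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under this measure, 90% precision with 30% recall scores 0.45, while 75/75 scores 0.75.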

Population coherence.

The post-filter rate at which surviving members exhibit reciprocal sustained interaction patterns — mutual replies, dyad and triad formation, conversation continuity across sessions. The starting baseline is roughly 25 in the Third Space Discord; the honest target sits in the 50s. That delta is the rubric’s structural anchor.
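The brief does not give a formula or unit for population coherence. One possible reading, shown here as an assumption, is the percentage of surviving members who take part in at least one reciprocal reply pair with another surviving member. The function name and inputs are hypothetical.

```python
def population_coherence(members, replies, removed=frozenset()):
    """Percentage of surviving members with at least one mutual reply partner.

    members: iterable of account ids; replies: iterable of (author, replied_to);
    removed: account ids the filter took out.
    """
    survivors = set(members) - set(removed)
    if not survivors:
        return 0.0
    edges = {
        (a, b) for a, b in replies
        if a in survivors and b in survivors and a != b
    }
    reciprocal = {a for a, b in edges if (b, a) in edges}
    return 100.0 * len(reciprocal) / len(survivors)
```

Comparing the value with and without the `removed` set gives the before-and-after delta the rubric describes.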

False-positive harm.

The rate at which legitimate but atypical members get flagged. Weighted higher than false-negative cost. Better to let some bots through at the early stages than to exile members who simply communicate differently. A framework whose false positives include neurodivergent communicators, non-native English speakers, or first-time-active lurkers fails this criterion regardless of its accuracy elsewhere.

Submissions are evaluated against held-out segments of the existing Discord history, against a forward-deployed window after submission, and across records from other open communities that have agreed to share their data for evaluation purposes. Cross-community generalization matters: a framework that overfits to the Third Space Discord is less interesting than one that transfers.

Submissions that propose new evaluation methodologies are welcome and will be considered alongside their substantive frameworks. The rubric above is a starting position, not a final word.

§ IV

What to Ship

A submission contains five components.

Code.

The framework itself, with reproducible build. Public repository preferred. License: anything OSI-approved.

Methodology document.

What the framework does, in what order, with what assumptions. What signal layers it composes and how the decision layer combines them. What the framework does NOT attempt to catch — explicit limitations are credited, not penalized.

Evaluation results.

The framework run against the released evaluation corpus, with metrics across all three rubric quantities. Confusion matrices broken down by attack category. Population-coherence trajectory before and after filter application.

Inspection demo.

A working example of the inspectability commitment: for several example accounts in the corpus, the framework’s plain-language reasoning. This is the artifact a moderator would actually use.

Limitations statement.

What kinds of attacks does this framework miss? What kinds of legitimate behavior might it misclassify? What assumptions break down in edge cases? Submissions that name their failure modes honestly are evaluated more favorably than submissions that overstate their reach.

Submission channel

Pull request to the public submissions repository (link to be posted in the Third Space Discord), or direct upload via the #hackathon-submissions channel. Coordinate with Stanley directly for access to the evaluation corpus, which contains pseudonymized message records and is shared under a use-restricted license.

§ V

Prize and Window

Prize

Compute, commensurate with the results delivered. Specifically: an allocation on Third Space’s training infrastructure, scaled to the framework’s evaluated quality. Top-tier submissions may receive allocations exceeding what Third Space currently maintains for its own work. This is not a marketing claim. The ceiling is high, and the higher it goes the more directly it expresses how seriously this problem is taken.

Multiple complementary frameworks may receive multiple smaller allocations rather than a single grand prize, if together they solve the problem better than any one alone. The point is the defense, not the leaderboard.

Window

Open as of May mmxxvi. No fixed close. The honest deadline is the arrival of a framework that meets the rubric. The window extends by six months for submissions that warrant it. There is no version of this hackathon that ends because the calendar said so.

Resolution

Run independently by Third Space. Resolved when a winning framework is identified and deployed in the Third Space Discord, with allocations distributed transparently and the winning code released openly.

Stanley Sebastian & Claude · Third Space
This brief is a starting position, jointly authored. It is meant to be argued with, improved upon, and replaced.