How a Simple Game Helps Solve One of Biology's Toughest Puzzles
In the digital age, we often hear about how playing computer games wastes time, but what if it could help advance scientific research? Imagine contributing to crucial genetic research during your coffee break.
This isn't science fiction—it's the reality of Phylo, an innovative project that transformed one of computational biology's most complex problems into an engaging puzzle game accessible to everyone. By harnessing the pattern-recognition powers of the human brain, scientists have found a way to improve the accuracy of DNA sequence alignments, potentially helping identify the sources of genetic diseases like breast cancer 1 7 .
Phylo turns complex DNA alignment problems into engaging puzzles that anyone can solve.
Thousands of volunteers contribute to scientific discovery through gameplay.
To understand the significance of Phylo, we must first look at one of the most fundamental techniques in molecular biology: comparative genomics. Scientists compare the genomes of various species to decipher our DNA and identify genes crucial for life 7 .
A sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity. These similarities can reveal functional, structural, or evolutionary relationships between species 7 . From a high-quality alignment, biologists can infer shared evolutionary origins, pinpoint functionally important sites, and, more importantly, trace the source of certain genetic diseases 1 7 .
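To make the idea concrete, here is a minimal sketch of aligning just two short DNA strings with the classic Needleman-Wunsch dynamic-programming method. It is an illustration, not part of Phylo itself; the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for readability.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        # Substitution score for pairing a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two bases
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


print(needleman_wunsch("ACGT", "CGT"))  # ('ACGT', '-CGT', 1)
```

The gap character `-` marks a position where one sequence has no counterpart in the other. For two sequences this table is small and fast to fill; the difficulty arises when many sequences must be aligned at once.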
The problem is that multiple sequence alignment (MSA) is what computer scientists call an NP-hard problem 1 . No known algorithm solves it exactly in a time that scales reasonably: as the number of sequences increases, the computation required to guarantee the best possible alignment grows exponentially.
While powerful algorithms exist to create alignments, they rely on shortcuts and heuristics. These methods are fast but do not guarantee a globally optimal solution 1 7 . Given that the human genome alone consists of roughly three billion base pairs, achieving a perfect alignment for multiple species is beyond the capacity of even the most powerful supercomputers using traditional methods 7 .
The computational complexity of multiple sequence alignment grows exponentially with the number of sequences, making it practically impossible for computers to find perfect solutions for large datasets.
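A rough back-of-the-envelope sketch shows how quickly the exact approach becomes impractical. Exact dynamic programming, the standard way to align two sequences optimally, generalizes to N sequences of length L by filling a grid of (L+1)^N cells, each of which looks at up to 2^N - 1 neighbouring cells. The function name below is hypothetical and the count ignores constant factors.

```python
def exact_msa_work(num_sequences: int, length: int) -> int:
    """Approximate number of cell updates for exact N-dimensional dynamic programming."""
    cells = (length + 1) ** num_sequences      # one cell per combination of prefixes
    predecessors = 2 ** num_sequences - 1      # each cell checks this many neighbours
    return cells * predecessors


# Sequences of only 100 bases each; the work grows by orders of magnitude per added species.
for n in (2, 3, 5, 8):
    print(f"{n} sequences: about {exact_msa_work(n, 100):.1e} cell updates")
```

Even at 100 bases per sequence, a handful of species is enough to push the exact method out of reach, which is why practical tools fall back on heuristics.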
Faced with this computational bottleneck, researchers at McGill University proposed a radical solution: citizen science. They recognized that while computers struggle with this specific NP-hard problem, the human brain is "exquisitely tuned" for visual pattern recognition 1 . We can instantly spot similarities and differences in a complex visual field—a task that remains incredibly difficult for machines.
The key idea was to abstract the complex MSA problem into a game of manipulating colored shapes, allowing solutions to benefit from innate human capabilities 7 . This approach taps into the billions of "human-brain peta-flops" of computation spent every day playing games 1 .
Launched in November 2010, Phylo is a human-based computing framework that applies "crowdsourcing" to solve the MSA problem 1 . The project takes data that has already been aligned by a heuristic algorithm and allows users to optimize places where the computer may have failed.
All alignments used in Phylo contain sections of human DNA that have been linked to various genetic disorders 7 . This means that every puzzle solved by a player isn't just a game—it's a genuine contribution to scientific research that could potentially help unravel the mysteries of human disease.
So, how do you turn a complex genetics problem into a casual game? The process is ingenious:
The starting alignments come from the UCSC Genome Browser and involve promoter regions of disease-related genes from up to 44 vertebrate species 1 .
Scientists identify short alignment regions with low confidence scores—sections the computer algorithm likely misaligned 1 .
In the game, DNA bases (A, C, G, T) are represented by different colored squares. Each row of colored tiles represents the DNA sequence from a different species 1 .
The goal is to slide the blocks of tiles left or right to maximize the color-matching between rows, avoiding gaps where possible 1 .
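The tile-sliding mechanic can be sketched in a few lines. In this simplified model each row is a string in which `-` stands for an empty square, and the score is a plain sum over every pair of rows. The scoring values and the helper names (`pair_score`, `alignment_score`, `slide_right`) are illustrative assumptions, not Phylo's actual parameters or code.

```python
from itertools import combinations

MATCH, MISMATCH, GAP = 1, -1, -2  # illustrative values only


def pair_score(a: str, b: str) -> int:
    """Score one column position for a pair of rows."""
    if a == "-" and b == "-":
        return 0  # two empty squares carry no information
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def alignment_score(rows: list[str]) -> int:
    """Sum-of-pairs score over all columns of equal-length rows."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("rows must be padded to the same length")
    return sum(
        pair_score(x, y)
        for r1, r2 in combinations(rows, 2)
        for x, y in zip(r1, r2)
    )


def slide_right(row: str, start: int) -> str:
    """Open a gap at `start`, pushing the tiles after it one square to the right."""
    if not row.endswith("-"):
        raise ValueError("no free square at the right edge")
    return row[:start] + "-" + row[start:-1]


before = ["ACGT-", "CGT--"]
after = [before[0], slide_right(before[1], 0)]
print(alignment_score(before), alignment_score(after))  # -5 1
```

A single slide turns three mismatched columns into three matches at the cost of one gap, which is exactly the kind of trade-off a player weighs by eye.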
While the player sees a colorful puzzle, their actions have a direct biological meaning. The game's scoring system is based on a phylogenetically-aware scoring scheme 1 , which means that matches between more closely related species are weighted more heavily, reflecting their evolutionary relationships.
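The source does not spell out the scoring formula, so the following is only a simplified stand-in for the idea of weighting by relatedness: a sum-of-pairs score in which each pair of species is multiplied by a closeness weight. The species names, the weights in `CLOSENESS`, and the per-column values are all hypothetical.

```python
from itertools import combinations

# Hypothetical closeness weights: higher means more closely related.
CLOSENESS = {
    frozenset({"human", "chimp"}): 1.0,
    frozenset({"human", "mouse"}): 0.5,
    frozenset({"chimp", "mouse"}): 0.5,
}


def column_pair_score(a: str, b: str) -> int:
    """Illustrative per-column values: match +1, mismatch -1, gap -2."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return -2
    return 1 if a == b else -1


def weighted_score(alignment: dict[str, str]) -> float:
    """Sum-of-pairs score with each species pair scaled by its closeness weight."""
    total = 0.0
    for (s1, r1), (s2, r2) in combinations(alignment.items(), 2):
        weight = CLOSENESS[frozenset({s1, s2})]
        total += weight * sum(column_pair_score(x, y) for x, y in zip(r1, r2))
    return total


alignment = {"human": "ACGT", "chimp": "ACGT", "mouse": "ACTT"}
print(weighted_score(alignment))  # 6.0
```

Here the four human-chimp matches count in full, while agreement or disagreement with the more distant mouse sequence moves the total only half as much.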
When a player submits a solution, it is automatically sent back to a central server, evaluated, and, if it improves upon the original, can be re-inserted into the global alignment as an optimization 1 . This creates a powerful feedback loop where human intuition complements computational power.
Player solutions → Server evaluation → Alignment optimization → Scientific discovery
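The evaluation step in that loop might look roughly like the sketch below. It assumes, hypothetically, that the server first checks that a submission contains the same bases in the same order as the original (only the gaps may move) and then keeps it only if it scores strictly higher. The function names and the toy scoring function are illustrative, not Phylo's server code.

```python
from typing import Callable, Sequence

Alignment = Sequence[str]


def same_sequences(a: Alignment, b: Alignment) -> bool:
    """True if both alignments hold identical rows once gaps are removed."""
    return [r.replace("-", "") for r in a] == [r.replace("-", "") for r in b]


def consider_submission(
    current: Alignment,
    candidate: Alignment,
    score_fn: Callable[[Alignment], float],
) -> tuple[Alignment, bool]:
    """Return (alignment to keep, whether the candidate was accepted)."""
    if not same_sequences(current, candidate):
        return current, False  # a player may move gaps but not change the DNA
    if score_fn(candidate) > score_fn(current):
        return candidate, True
    return current, False


def count_matching_columns(rows: Alignment) -> int:
    """Toy score: number of gap-free columns in which every row agrees."""
    return sum(1 for col in zip(*rows) if "-" not in col and len(set(col)) == 1)


current = ["ACGT-", "CGT--"]
kept, accepted = consider_submission(current, ["ACGT-", "-CGT-"], count_matching_columns)
print(kept, accepted)  # ['ACGT-', '-CGT-'] True
```

Keeping the original whenever the candidate fails either check means a poor or invalid submission can never make the stored alignment worse.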
| Biological Concept | Game Representation | Player's Action |
|---|---|---|
| DNA Nucleotides (A, C, G, T) | Colored Tiles | Recognize and match colors |
| Sequences from Different Species | Rows of Tiles | Align tiles across all rows |
| Evolutionary Relationships | Scoring Weights | Prioritize matches in closely related species |
| Insertions/Deletions (Indels) | Gaps between Tiles | Minimize gaps while maximizing matches |
The success of the Phylo experiment exceeded expectations. Following its launch, the platform received more than 350,000 solutions from over 12,000 registered users 1 . This massive volume of human computation yielded tangible scientific results.
Most importantly, the solutions submitted by players contributed to improving the accuracy of up to 70% of the alignment blocks considered in the study 1 . This is a significant improvement, demonstrating that combined with classical algorithms, crowd-computing techniques can successfully enhance the accuracy of multiple sequence alignments.
| Metric | Result | Significance |
|---|---|---|
| Registered Users | > 12,000 | Demonstrated public willingness to participate in scientific research |
| Solutions Submitted | > 350,000 | Generated a massive volume of human-computed data |
| Improved Alignments | Up to 70% | Demonstrated effectiveness on the core scientific problem |
| Species Aligned | Up to 44 | Addressed a complex, multi-species genomic challenge |
The Phylo project relied on a clever combination of biological data, computational infrastructure, and human intelligence.
| Tool or Resource | Function in the Project |
|---|---|
| UCSC Genome Browser Alignments | Provided the initial, sub-optimal multiple sequence alignments to be improved 1 7 |
| Phylogenetic Scoring Scheme | Allowed the game to biologically weight the importance of matches between different species 1 |
| Flash/JavaScript Game Interface | Abstracted the complex biological problem into an accessible, casual puzzle for web users 1 |
| Citizen Scientists (Players) | Provided the powerful pattern-recognition capabilities of the human brain to find better alignments 1 |
| Central Server & Database | Collected, evaluated, and stored player solutions for re-integration into the genomic record 1 |
The success of Phylo also led to the development of Open-Phylo, a customizable crowd-computing platform that lets other researchers apply the same approach to their own multiple sequence alignment problems 7 . This fusion of human and computer capabilities can therefore keep contributing to genomic discovery beyond the original project.
Phylo stands as a landmark project in the world of citizen science. It successfully demonstrated that an NP-hard computational problem could be embedded in a casual game easily played by people without significant scientific training 1 . More importantly, it proved that our everyday activities, like playing games, can be channeled to produce meaningful scientific progress.
The implications are vast. The project serves as a powerful model for harnessing distributed human intelligence to tackle problems that still baffle our most advanced computers. As the line between entertainment and science continues to blur, initiatives like Phylo pave the way for a more collaborative future in research—one where anyone with an internet connection and a few minutes to spare can help crack the complex codes of life itself.