
Ten algorithms that changed science

Thanks to its built-in macro recorder, which captures workflows as sequences of mouse clicks, its broad compatibility with file formats and its flexible plug-in architecture, the software can easily be extended by any user. It is now possible, for example, to track objects in videos or to identify biological cells automatically. "In contrast to Photoshop and other programs, ImageJ can do anything you want," says Eliceiri.

7. Sequencing: BLAST (1990)

The cultural relevance of a product can be seen when its name becomes a verb. For web searches, it's Google; for genetics, it's BLAST.

Evolution is written into molecular sequences in the form of mutations such as substitutions, insertions and deletions, which show up as mismatches and gaps when sequences are aligned. By looking for similarities between such sequences - especially those of proteins - researchers can discover relationships and gain insight into the function of genes. Evidence of such relationships can be found in the rapidly growing biochemical databases.

Margaret Dayhoff made a decisive contribution to this in 1978. She developed the "point accepted mutation" matrix (PAM matrix for short), which assesses the relationship between two proteins not only by the similarity of their sequences, but also by their evolutionary distance.
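The core idea of matrix-based scoring can be sketched in a few lines of Python. The matrix below is a toy example for illustration only, not Dayhoff's actual PAM values, which cover all 20 amino acids with empirically derived log-odds scores:

```python
# Toy substitution matrix: higher scores mean a pair of residues is
# more plausible as an evolutionary match. Illustrative values only.
TOY_MATRIX = {
    ("A", "A"): 2, ("A", "G"): 1, ("A", "W"): -3,
    ("G", "G"): 3, ("G", "W"): -4,
    ("W", "W"): 10,
}

def pair_score(a, b, matrix=TOY_MATRIX):
    # Substitution matrices are symmetric, so try both orderings.
    return matrix.get((a, b), matrix.get((b, a), 0))

def alignment_score(seq1, seq2, matrix=TOY_MATRIX):
    # Score a gap-free alignment of two equal-length fragments by
    # summing the per-position substitution scores.
    assert len(seq1) == len(seq2), "fragments must be aligned"
    return sum(pair_score(a, b, matrix) for a, b in zip(seq1, seq2))
```

Under such a scheme, identical residues score high, chemically similar ones moderately, and implausible substitutions score negative, so the total reflects evolutionary plausibility rather than mere identity.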

Seven years later, William Pearson at the University of Virginia in Charlottesville and David Lipman at the NCBI presented the FASTP algorithm, which combined Dayhoff's matrix with fast searching. In 1990, Lipman and colleagues published a further refinement: the Basic Local Alignment Search Tool (BLAST). It is extremely fast and also finds matches that are evolutionarily more distant, while indicating how likely it is that a match arose by chance.
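The "local alignment" in BLAST's name refers to finding the best-matching region between two sequences rather than aligning them end to end. BLAST achieves its speed with heuristics; the exhaustive dynamic-programming version of the same problem (the Smith-Waterman algorithm, shown here with simple match/mismatch scores rather than a full substitution matrix) is a useful sketch of what it approximates:

```python
def local_align_score(s, t, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between s and t (Smith-Waterman).

    An exhaustive dynamic-programming sketch of the local-alignment
    problem that BLAST solves heuristically and far faster.
    """
    rows, cols = len(s) + 1, len(t) + 1
    # H[i][j] = best score of an alignment ending at s[i-1], t[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i-1][j-1] + (match if s[i-1] == t[j-1] else mismatch)
            # The 0 lets a poor alignment be abandoned and restarted,
            # which is what makes the alignment "local".
            H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
            best = max(best, H[i][j])
    return best
```

Because the matrix is quadratic in the sequence lengths, this exact method is far too slow for database-scale searches, which is precisely the gap BLAST's seed-and-extend heuristics were designed to close.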

"It's one of those things that became a verb" (Sean Eddy, bioinformatician at Harvard University in Cambridge, Massachusetts)

The program was incredibly fast, recalls the computer scientist Stephen Altschul from NCBI who was involved at the time. "You could type in your search, have a sip of coffee, and you had the result." It was also easy to use. At a time when databases were being updated by mail, NCBI's Warren Gish set up an email system and later a web-based architecture that allowed users to remotely search the NCBI computers.

The system gave the then burgeoning field of genome biology a new tool, says bioinformatician Sean Eddy of Harvard University in Cambridge, Massachusetts. "It's one of those things that became a verb," says Eddy. "There was talk of BLASTing your sequences."

8. Preprint warehouse: arXiv.org (1991)

In the late 1980s, high-energy physicists routinely mailed copies of their submitted manuscripts to colleagues, either as a courtesy or to solicit comments. Because of the effort involved, however, they sent them only to a select few. “Those lower down the food chain relied on celebrity charity. Aspiring researchers at non-elite universities were often completely left out,” wrote physicist Paul Ginsparg in 2011.

Therefore, in 1991, at the Los Alamos National Laboratory in New Mexico, he developed an e-mail distribution list whose subscribers received lists of preprints every day, each with an article identifier. With a single email, users around the world could submit work, access it from the lab's computer system, receive lists of new articles, or search for specific authors and titles.

Originally, Ginsparg wanted to keep the preprints for three months and to limit the content to high-energy physics. But a colleague convinced him to retain them indefinitely. "At that moment the concept changed from a bulletin board to an archive," he says. Suddenly, articles arrived from far-flung disciplines. In 1993 Ginsparg migrated the system to the World Wide Web, and in 1998 he named it arXiv.org.

Today, in the 30th year of its existence, the website hosts around 1.8 million preprints - all freely accessible - with more than 15,000 monthly submissions and around 30 million downloads. “It's not difficult to see why the service is so popular,” wrote the editors of Nature Photonics on the occasion of arXiv's 20th anniversary: “It offers researchers a quick and convenient way to show what they have done and when, and avoids the hassle and time of peer review in a traditional journal.”

The success led to similar projects in numerous other disciplines, such as biology, medicine and sociology. The effect can currently be seen in the tens of thousands of preprints prompted by the coronavirus pandemic. "It is gratifying to see that a methodology considered heterodox 30 years ago outside the particle physics community is now generally regarded as natural," Ginsparg concludes.

9. Data processing: IPython Notebook (2011)

Fernando Pérez was a PhD student “in search of distraction” in 2001 when he decided to explore a core component of Python, a widely used programming language: its interactive interpreter, a so-called REPL (read-evaluate-print loop). You enter your code, the program executes it line by line and delivers a result. The loop worked reliably and quickly, but Pérez noticed that Python had not been designed for scientific work. For example, there was no simple way to preload certain code modules or to keep data visualizations open. So Pérez decided to write such a version himself.
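The read-evaluate-print loop Pérez started from can itself be sketched in a few lines of Python. This toy version only evaluates expressions and is in no way how IPython is actually implemented; real shells also handle statements, errors, history and rich display:

```python
def repl_step(line, env):
    # One turn of the loop: "evaluate" the input in a shared
    # namespace, then "print" (here: return) its representation.
    result = eval(line, env)
    return repr(result)

def repl(lines):
    # The "read ... loop" part: feed a sequence of inputs through
    # one persistent namespace, collecting the outputs.
    env = {}
    return [repl_step(line, env) for line in lines]
```

Even this sketch shows why the pattern suits exploratory work: each input is executed immediately and its result shown, so an analysis can be built up one step at a time.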

The result was IPython, an interactive package for developing and running Python programs, which Pérez released in December 2001 - and which consisted of just 259 lines of code. Ten years later, together with the physicist Brian Granger and the mathematician Evan Patterson, he moved the system into the web browser, released the IPython notebook, and thus triggered a revolution in data science.

Computational notebooks do not just contain program code; they combine code, text and graphics in a single document. In contrast to other systems of this type (such as Mathematica or Maple), IPython was open source and welcomed contributions from other developers. And it supported Python, a programming language very popular among scientists.
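On disk, such a notebook is simply a JSON file interleaving text and code cells. A minimal sketch, assuming the published nbformat-4 layout (the field names below follow that format, not anything stated in this article):

```python
import json

# A minimal notebook: one markdown cell and one code cell, with
# metadata left empty for brevity.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Analysis notes\n"]},
        {"cell_type": "code", "metadata": {},
         "execution_count": None, "outputs": [],
         "source": ["print(1 + 1)\n"]},
    ],
}

serialized = json.dumps(notebook, indent=1)
```

Because the format is plain JSON, notebooks can be versioned, shared and rendered by services such as GitHub without running any code, which helped them spread as a publication medium.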

In 2014, IPython evolved into Jupyter, which now supports more than 100 programming languages and enables users to process data on remote supercomputers as easily as on their own laptops. "Jupyter has become a standard tool for data scientists," wrote Nature in 2018. At the time there were 2.5 million Jupyter notebooks on the code-sharing platform GitHub; the number has since quadrupled.

These include the notebooks documenting the discovery of gravitational waves in 2016 and the first image of a black hole in 2019. "We are very proud that we made a small contribution to this groundbreaking work," says Pérez.

10. Fast Learner: AlexNet (2012)

Artificial intelligence (AI) comes in different flavors. The old school built AI from explicitly coded rules; "deep learning", dominant today, lets computers "learn" by simulating the neural structure of the brain. For decades, AI researchers dismissed the latter approach as nonsensical, explains computer scientist Geoffrey Hinton of the University of Toronto. But in 2012 his doctoral students Alex Krizhevsky and Ilya Sutskever proved the opposite with AlexNet.

The venue was ImageNet, an annual competition in which researchers train object-recognition algorithms on a million images of everyday objects and then test them on a separate data set. By 2012, the best programs were miscategorizing about a quarter of the images, Hinton recalls. AlexNet, a neural network, cut the error rate to an astonishing 16 percent.
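The metric behind those figures is simple to state. ImageNet's headline number is actually a top-5 error (a prediction counts as correct if the true label appears among the model's five highest-ranked guesses); the top-1 sketch below is the simplest illustrative version:

```python
def error_rate(predictions, labels):
    """Fraction of held-out test items the model labels incorrectly.

    Top-1 version for illustration; ImageNet's headline metric is a
    top-5 variant, where a prediction is correct if the true label
    is among the model's five highest-ranked guesses.
    """
    assert len(predictions) == len(labels)
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)
```

Scoring on a data set the model never saw during training is what makes the number meaningful: it measures generalization rather than memorization.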

A large amount of training data, good programming, and the increasing power of graphics processors have led to this success, according to Hinton. "Suddenly we could run our program 30 times faster," he says, "or learn from 30 times as much data."

The real breakthrough had come three years earlier, when his laboratory created a neural network that could recognize speech far better than conventional AI. These advances heralded the rise of deep learning. Such algorithms are the reason cell phones understand spoken requests and image-analysis programs can recognize biological cells. And that is why AlexNet earns a place among the computer programs that have fundamentally changed science - and with it the world - in recent years.