Using AI to analyze the human brain

CAMH and Dell EMC unveil their Neuroinformatics Platform, a superstructure of computation

By Jana Manolakos

The human brain has about 86 billion neurons interconnected by roughly 100 trillion synapses that shoot messages to each other in countless configurations. These are big numbers and big data in an organ that takes up only two per cent of your body weight while continuously performing complex neural computations – making for an impressive and mysterious cognitive machine.

Now, a new state-of-the-art computer platform at Toronto’s Centre for Addiction and Mental Health (CAMH) will bring neuroscientists closer to unravelling those very same mysteries, as they work to resolve brain disorders. The system harnesses massive amounts of data from medical imaging, genomics and clinical research.

Late last summer, teams of scientists and systems engineers from Dell EMC and the Krembil Centre for Neuroinformatics at CAMH went live with what can only be described as a behemoth of computation, the CAMH Neuroinformatics Platform. Like the Ontario Brain Institute’s Brain-CODE platform, CAMH’s new tool is a centralized research database that secures, manages and organizes complex data for researchers from within the organization and around the world.

David Rotenberg, Krembil’s Operations Director, says the million-dollar initiative, which took about a year and a half to develop, was launched to provide a robust, centralized research data management superstructure – a critical piece he sees missing in many larger institutions.

“At CAMH, our goal is to understand how the human brain works in health and in disease; and in particular, to understand what the biological underpinnings of these disorders are,” Rotenberg explains. “A lot of the stigma associated with mental illness is that it’s something mysterious; it’s something in the mind. Whereas with other disorders, such as cancer, there are clear biological causes that people understand; and, when you understand the cause, it helps to remove that stigma, but it also helps us move towards treatment.

“Mental health is something that can be understood and it can be solved. By collecting these different types of data, we can look across the scale from the gene to the brain cell to the small circuits, and then to the overall structure of the brain, and then to how that alters the symptoms that we see in the clinic. Data is at the heart of being able to understand the relationship between the hierarchies and where these issues might lie, but also the full spectrum of diagnoses and how they may overlap.”

With 50 research projects well underway since the platform went live, it clearly answers an important call. The system currently holds data from 34,000 researchers and links to 380,000 CAMH patient records, for a whopping 15 terabytes of datasets. With so many records, cyber and data security are critically important for CAMH and its administration.

Participating researchers must sign several stringent agreements before joining, and they fall under the jurisdiction of CAMH’s Research Ethics Board. Server rooms are under lock and key, with access limited to authorized personnel. All systems are password protected, sit behind multiple firewalls and are constantly scanned for security breaches, intrusions and viruses. To protect client confidentiality, patient names and addresses are kept separate from their corresponding data.

“You cannot combine the two, so if you had someone’s brain scan you wouldn’t know who they are,” Rotenberg says. The system organizes data in advance, saving researchers from spending weeks and months “digging” to find, organize and manually link other datasets. “There are lots of different groups of researchers with different data needs, such as MRI, PET, EEG, epigenetics, genetics,” Rotenberg explains, “all of which are supported under the same framework.”
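The separation Rotenberg describes is a standard pseudonymization pattern: identifying details live in one store, research data in another, linked only by a random key that reveals nothing on its own. A minimal Python sketch of the idea – the store names, fields and sample values here are illustrative, not CAMH’s actual schema:

```python
import uuid

# Two separate stores: identifiers in one, research data in the other.
# Only the random pseudonym links them.
identity_store = {}   # pseudonym -> name/address (kept locked away)
research_store = {}   # pseudonym -> scans, genetics, etc.

def admit_patient(name, address):
    """Register a patient, keeping identity and research data apart."""
    pseudonym = str(uuid.uuid4())      # random link key, not derived from identity
    identity_store[pseudonym] = {"name": name, "address": address}
    research_store[pseudonym] = {}     # research data accumulates here
    return pseudonym

pid = admit_patient("Jane Doe", "123 Main St")
research_store[pid]["mri"] = "scan-0001.nii"

# A researcher holding only the research store sees a brain scan,
# but has no way to recover who it belongs to.
print(research_store[pid])
```

Because the pseudonym is random rather than derived from the name, compromising the research store alone cannot re-identify anyone; the identity store must be breached as well.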

In Canada, CAMH is a member of the Canadian Open Neuroscience Platform, which brings together many of the country’s leading scientists in an interactive network of collaboration. Rotenberg says this will ensure the interoperability of data, which he sees as vitally important. “Interoperability is a key principle toward bringing institutions together in meeting these challenges rather than being isolated to ask those research questions without the larger data.”

Globally, CAMH works with Europe’s Human Brain Project and Switzerland’s Blue Brain Project. It’s a “natural fit,” according to Rotenberg, because there is mutual interest in understanding the human brain at the cell level and the whole-brain level. For new researchers accessing the system, CAMH offers twice-monthly workshops as well as quarterly sessions. “We’ve seen trends that show you need to know computation in this day and age, to be able to ask the right questions when faced with these huge volumes of data,” Rotenberg notes. “We train folks on accessing the system, using the visualizations and supporting data analysis.”

Researchers can easily transfer their data into the platform and integrate it with data from other sources; as Rotenberg says, “This really allows people to share and interact in the same consistent environment no matter where they’re from.” All the data are secured by permissions for specific research groups.

By using Oracle VM (OVM), CAMH adopted a primarily virtualized architecture, in which one physical computer behaves like many “virtual” computers. In the CAMH platform, these virtual machines contain the operating system and the software but are not tied directly to the physical machine. OVM enables flexible deployment, efficient snapshots for backup and simplified fail-over, switching virtual machines on at another host if hardware fails.
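The fail-over idea can be sketched in the abstract: because a virtual machine is a description plus a disk image on shared storage rather than a particular box, it can be restarted on any surviving host. A toy Python model of that logic – the class and host names are hypothetical, not OVM’s actual API:

```python
class Host:
    """A physical server that can run virtual machines."""
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.vms = []

def fail_over(vm, hosts):
    """Restart vm on the first healthy host. The VM image lives on
    shared storage, so nothing is copied -- only where it runs changes."""
    for host in hosts:
        if host.alive:
            host.vms.append(vm)
            return host.name
    raise RuntimeError("no healthy host available")

primary, secondary = Host("site-A"), Host("site-B")
primary.vms.append("analysis-vm")

primary.alive = False              # simulate a hardware failure at site A
vm = primary.vms.pop()
print(fail_over(vm, [primary, secondary]))  # -> site-B
```

The key design point is that the hypervisor decouples the running system from the hardware: the dead host is skipped and the same VM simply comes up elsewhere.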

“Part of our backup strategy is that we have half of these arrays at one site and the other at the other site of our two campuses,” Rotenberg explains. “We use that for backup and fail-over, so if one site were to go completely down we can use the other site to access all the data. The actual data is backed up every hour on all these systems. We basically take snapshots of the data frequently so if the data were deleted accidentally, we can roll back up to a month.”
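Hourly snapshots with a one-month rollback window imply a simple retention policy: keep every snapshot newer than roughly 30 days, drop the rest, and recover by restoring the newest snapshot taken before the accident. A simplified Python sketch of that logic (real systems snapshot at the storage-block level, not as Python objects):

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)

def prune(snapshots, now):
    """Keep only snapshots inside the retention window."""
    return [s for s in snapshots if now - s["taken"] <= RETENTION]

def roll_back(snapshots, accident_time):
    """Pick the newest snapshot taken strictly before the accident."""
    candidates = [s for s in snapshots if s["taken"] < accident_time]
    return max(candidates, key=lambda s: s["taken"]) if candidates else None

now = datetime(2020, 6, 1)
# 45 days of hourly snapshots
snapshots = [{"taken": now - timedelta(hours=h)} for h in range(24 * 45)]

kept = prune(snapshots, now)                       # only the last 30 days survive
restore = roll_back(kept, accident_time=now - timedelta(hours=3))
print(len(kept), restore["taken"])
```

With hourly snapshots, an accidental deletion costs at most the work done since the last snapshot – here, a rollback to an accident three hours ago restores the state from the hour before it.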

To store all the data and backups across the two sites takes 1.9 petabytes of storage. When you consider that the average mobile phone holds about 16 gigabytes, that’s roughly the capacity of 119,000 mobile phones combined.

These data are priceless, says Rotenberg, adding that it costs about $1,000 for a typical genetic sequence and as much as $10,000 for some of the newer scans, “so we take very precious care of our data, and I get to sleep at night.”  
