Peter Wills Bioinformatic Centre

CanSto


One consequence of using tissue microarrays (TMAs) to assess many tumor markers on multiple samples is the large volume of data generated in each experiment. For example, a single experiment that examines expression of one tumor marker by immunohistochemistry (IHC) on a medium-density prostate cancer TMA, incorporating between 2 and 4 specimens of cancer sampled from each of 150 radical prostatectomy (RP) specimens, would generate approximately 400 separately stained samples for each gene examined. Increasingly, information management systems designed to archive and manage TMA data and images efficiently are being used for molecular pathology studies. These systems comprise a database, usually implemented on a network server separate from the client computers, and an application that runs on the client computer or through a web interface. The application embodies the analytical requirements of the scientists who use the system.

At our Institute, a database was implemented using the relational database management system Sybase. An application called CanSto was written using the REALbasic development environment. CanSto runs on any PC or Macintosh computer in the network. The database is populated and managed by the scientists using the CanSto application. The database is also available for analysis through a web interface, which is particularly useful for the international collaborators who take part in the research. A general-purpose query application, GQA, is used by the scientists to build and save database queries. GQA can run against all databases designed within the Institute's information architecture. GQA gives the scientists the freedom to analyse their data in any way they choose, because it constructs standard Structured Query Language (SQL) queries and runs them against the database. The scientists do not need to understand the complexity of SQL, as GQA builds and saves the queries for them. With GQA, information on any given patient, gene, histological diagnosis or tissue, separately or in combination, can be retrieved simply.
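
The way a tool such as GQA might turn a scientist's selections into SQL can be sketched as follows. This is only an illustration, not the GQA implementation: the table name tma_result, the column names, the function build_query and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the Sybase server.

```python
import sqlite3

# Hypothetical column names; the real CanSto schema is not described here.
ALLOWED_FIELDS = ("patient_id", "gene", "diagnosis", "tissue")


def build_query(filters):
    """Return (sql, params) selecting rows that match every given filter."""
    clauses, params = [], []
    for field, value in filters.items():
        # Only whitelisted column names are ever placed into the SQL text.
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {field}")
        clauses.append(f"{field} = ?")
        params.append(value)  # values are passed as bound parameters
    sql = "SELECT patient_id, gene, diagnosis, tissue FROM tma_result"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY patient_id, gene", params


def demo():
    """Run a combined gene + diagnosis query against made-up rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for the Sybase server
    conn.execute(
        "CREATE TABLE tma_result "
        "(patient_id TEXT, gene TEXT, diagnosis TEXT, tissue TEXT)"
    )
    conn.executemany(
        "INSERT INTO tma_result VALUES (?, ?, ?, ?)",
        [
            ("ID-001", "GENE_A", "carcinoma", "prostate"),
            ("ID-001", "GENE_B", "carcinoma", "prostate"),
            ("ID-002", "GENE_A", "benign", "prostate"),
        ],
    )
    sql, params = build_query({"gene": "GENE_A", "diagnosis": "carcinoma"})
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows
```

In this sketch the scientist supplies only field/value pairs; the SQL text is assembled for them, and filters can be used separately or in combination simply by adding or removing entries.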


The CanSto application allows the scientists to collect an H&E image and up to 2 images of the stained specimen for each experiment performed on the TMA, as well as the patient ID, array coordinates, array name, block, tissue, tissue source, histological diagnosis and pathology of each specimen arrayed on the TMA. All data entered by the scientists using CanSto are stored in the Sybase relational database. Images associated with cells of the TMA are stored on a central file server as moderately compressed files in JPEG format. The location of each image file, and the cell that the image is associated with, are stored in the database. The CanSto application works with the Sybase database and the image files to present a seamless, unified system to the scientist.
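
The arrangement described above, with image files on a file server and only their locations held in the database, might be modelled along these lines. The table names (tma_cell, cell_image), column names, file paths and the helper image_paths are assumptions made for illustration, and SQLite is used in place of Sybase.

```python
import sqlite3

# Hypothetical schema: one row per TMA cell, one row per image file.
SCHEMA = """
CREATE TABLE tma_cell (
    cell_id    INTEGER PRIMARY KEY,
    array_name TEXT NOT NULL,
    row_coord  INTEGER NOT NULL,
    col_coord  INTEGER NOT NULL,
    patient_id TEXT NOT NULL
);
CREATE TABLE cell_image (
    image_id  INTEGER PRIMARY KEY,
    cell_id   INTEGER NOT NULL REFERENCES tma_cell(cell_id),
    kind      TEXT NOT NULL CHECK (kind IN ('H&E', 'stain')),
    file_path TEXT NOT NULL  -- location on the file server, not the image
);
"""


def image_paths(conn, array_name, row, col):
    """Return (kind, file_path) pairs for the cell at the given coordinates."""
    cur = conn.execute(
        "SELECT i.kind, i.file_path "
        "FROM cell_image AS i "
        "JOIN tma_cell AS c ON c.cell_id = i.cell_id "
        "WHERE c.array_name = ? AND c.row_coord = ? AND c.col_coord = ? "
        "ORDER BY i.image_id",
        (array_name, row, col),
    )
    return cur.fetchall()


def demo():
    """Register one made-up cell with an H&E image and one stained image."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO tma_cell VALUES (1, 'TMA-01', 3, 7, 'ID-001')")
    conn.executemany(
        "INSERT INTO cell_image (cell_id, kind, file_path) VALUES (?, ?, ?)",
        [
            (1, "H&E", "/images/TMA-01/r3c7_he.jpg"),
            (1, "stain", "/images/TMA-01/r3c7_stain1.jpg"),
        ],
    )
    paths = image_paths(conn, "TMA-01", 3, 7)
    conn.close()
    return paths
```

Keeping only the path in the database keeps the tables small, while the client application can still fetch and display the image alongside the specimen data.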

 


It is typical for data in a research institute to be collected using several different mechanisms, often implemented in the technology prevailing at the time collection commenced. Ideally, unified systems are built to encompass all information management needs, but frequently there is a need to bring together data from separate sources. Our information architecture allows for this. In particular, we have a FileMaker database of disease-specific patient outcome data and other clinical and pathological variables of prognostic relevance for the patients whose tissue is analysed on the TMA. These data can be brought into the CanSto data model and matched automatically by unique patient ID. This allows a GQA query to return not only the TMA data, but also the TMA data matched to clinical outcome details. Thus, the TMA data can be easily correlated with clinical endpoints. The use of a unique patient ID ensures that the data are anonymous and cannot be traced back to patients.
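
Matching TMA results to outcome data by unique patient ID amounts to a join on that ID. A minimal sketch is given below, again with SQLite standing in for Sybase; the tables tma_result and patient_outcome, the columns score and relapsed, and all values are invented for illustration and do not reflect the actual CanSto or FileMaker data.

```python
import sqlite3


def load_demo(conn):
    """Create two hypothetical tables and fill them with made-up rows."""
    conn.executescript(
        """
        CREATE TABLE tma_result (patient_id TEXT, gene TEXT, score INTEGER);
        CREATE TABLE patient_outcome (
            patient_id TEXT PRIMARY KEY,
            relapsed   INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO tma_result VALUES (?, ?, ?)",
        [("ID-001", "GENE_A", 2), ("ID-002", "GENE_A", 0), ("ID-003", "GENE_A", 3)],
    )
    # ID-003 deliberately has no outcome record yet.
    conn.executemany(
        "INSERT INTO patient_outcome VALUES (?, ?)",
        [("ID-001", 1), ("ID-002", 0)],
    )


def tma_with_outcome(conn):
    """Return each TMA result with its outcome, matched on patient_id."""
    # LEFT JOIN keeps TMA rows even when no outcome record exists.
    return conn.execute(
        "SELECT t.patient_id, t.gene, t.score, o.relapsed "
        "FROM tma_result AS t "
        "LEFT JOIN patient_outcome AS o ON o.patient_id = t.patient_id "
        "ORDER BY t.patient_id, t.gene"
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    load_demo(conn)
    rows = tma_with_outcome(conn)
    conn.close()
    return rows
```

A left join is chosen here so that specimens without outcome data still appear in the result, with an empty outcome value, rather than being silently dropped.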

 



Another database model, described recently by Manley et al. (2001), uses a relational database structure created in Microsoft Access 2000 (Microsoft, WA) to manage (1) clinical and pathology data, (2) TMA location information, and (3) web-based histology results. The TMA component of these databases comprises data from 336 prostate cancer patients transferred into 19 TMA blocks with 5451 TMA biopsy cores (Manley et al., 2001). A feature of this model is the integration of a customized imaging system (BLISS; Bacus Labs. Inc., IL and Prostate SPORE Tissue Microarray Working Group) that automatically captures high-resolution composite digital images of the TMA samples and assigns a unique name to each image of each sample acquired from the array. The images are then linked to an image database that is designed to permit image viewing, entering and editing over the Internet by authorized users, facilitating collaborative research.

CanSto is built with REALbasic and released as a native binary (Win32, Mac OS 8/9 and Mac OS X); it is also accessible via a web interface. The data are warehoused in Sybase hosted on Solaris.


©2003 PWBC