Monday, March 26, 2007

Fink is not as easy as it sounds

I did not realize that on Mac OS there are still a lot of things that you cannot do easily.

Fink is one example. If the package you want is not in the stable list (I am trying to install one from the unstable list), it is not trivial to install. A package often has dependencies, and it is tedious to install them manually one by one.
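To show why doing this by hand gets tedious, here is a small sketch of how an install order can be worked out from a dependency list. The package names and the dependency map are made up for illustration and are not taken from Fink's package database; the function name install_order is also my own. It assumes Python 3.9 or later, where graphlib is in the standard library.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: package -> packages it needs first.
# These entries are illustrative only, not Fink's real metadata.
DEPENDENCIES = {
    "biopython": {"python", "numeric"},
    "numeric": {"python"},
    "python": {"readline"},
    "readline": set(),
    "mysql": set(),
}


def install_order(target, dependencies):
    """Return the packages needed for `target`, dependencies first."""
    # Walk the map to collect only the packages that `target` needs.
    needed = {}
    stack = [target]
    while stack:
        package = stack.pop()
        if package in needed:
            continue
        needed[package] = set(dependencies.get(package, ()))
        stack.extend(needed[package])
    # A topological sort puts every package after the ones it depends on.
    return list(TopologicalSorter(needed).static_order())


if __name__ == "__main__":
    print(" -> ".join(install_order("biopython", DEPENDENCIES)))
```

Even with only four packages, the order matters; a real package can pull in many more, which is the work a package manager is supposed to take off your hands.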

I am trying to find a way to browse the unstable packages (in Fink, unstable does not mean bad; it sometimes just means "not tested enough").

On my first attempt, "fink list" only showed mysql4. After running fink selfupdate and selecting rsync, the list was updated and now shows mysql5. But I still cannot install biopython, which is in the unstable package list.

Things to be careful about:

1. After a default Fink install, even from the latest version, you still need to run
fink selfupdate, so that the package list can update itself.

Sunday, March 25, 2007

Bioinformatics Infrastructure

1. LIMS
I have already covered this in an earlier post.

2. Web Interface for customized program release
We need a web system that lets biologists access the programs we develop through a Web interface. This saves programmers from installing them on different platforms. The advantage is that the programs can be used easily by other people.
Most of the programs in this category need an input file from the user and return a result.
Things that need to be done:
a. Web interface
b. Backend data processing
c. Privacy protection.
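
The backend part of the list above could look roughly like this. It is only a sketch under my own assumptions: the function name run_job, the fixed file name input.txt, the 60 second timeout, and the stand-in "analysis program" (a one-line FASTA record counter) are all hypothetical. The web layer that receives the upload is left out.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_job(input_bytes, command):
    """Run `command` on an uploaded file and return its standard output.

    `command` is a list of program arguments; the path of the saved
    input file is appended as the last argument.
    """
    # Each job gets its own temporary directory, so one user's upload is
    # kept apart from another's, and it is deleted when the job finishes.
    with tempfile.TemporaryDirectory() as job_dir:
        input_path = Path(job_dir) / "input.txt"
        input_path.write_bytes(input_bytes)
        completed = subprocess.run(
            command + [str(input_path)],
            capture_output=True,
            text=True,
            check=True,   # raise if the program exits with an error
            timeout=60,   # assumed limit so a stuck job cannot hang forever
        )
    return completed.stdout


# Stand-in for a real analysis program: count the FASTA records in a file.
COUNT_RECORDS = [
    sys.executable,
    "-c",
    "import sys; print(sum(1 for line in open(sys.argv[1]) if line.startswith('>')))",
]

if __name__ == "__main__":
    fasta = b">seq1\nACGT\n>seq2\nGGCC\n"
    print(run_job(fasta, COUNT_RECORDS))
```

Passing the command as a list, without a shell, means a user-supplied file name or content is never interpreted as a shell command. That covers only a small part of the privacy and security item.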

PISE package is a good starting point: http://www.pasteur.fr/recherche/unites/sis/Pise/

3. Computation Pipelines
a. EST Clustering and analysis
b. Sequence calling and assembly (Phred/Phrap/Consed)
c. Association Studies
d. Data Integration (iProtein)
e. Network and Pathway analysis
f. Structure prediction and modelling
g. Data submission to public databases (NCBI, PDB, GEO)
h. SNP analysis
i. Workflow (Kepler)

4. Local databases
a. NCBI
b. SRS
c. Human Genome Browser

5. Software tools
a. GeneSpring (microarray analysis)
b. Spotfire (Multi-dimensional data analysis and visualization)
c. Matlab
d. Mathematica

6. Project management tools, Time Management and tracking, CRM

7. Cluster Computing Infrastructure
a. Open source software in cluster environment (some of it from the Rocks distribution)
b. local databases

8. Computing infrastructure

a. Servers
b. Development environment (test, development, and production environments)

9. IT infrastructure
a. Network (DNS, DHCP)
b. Security (Kerberos, Firewall)

Configure My MacBook Pro

I was trying to install Biopython-related modules on my Mac. I am still new to Mac OS X.

I found Fink for installing things. On the first try, I failed to install bioperl. So it is not as easy as I expected.
I also needed to install the Apple developer package, which requires a free registration.

My experience with Mac OS so far:

1. Wireless connection is good: reliable, but somehow the connection speed feels slow.
2. The mouse is hard to click, very rigid.
3. Multimedia is good. Once I put a CD in, it is recognized automatically. Same thing for a DV camera. I tried this on Windows; even after installing the required software, it was still hard to do.
4. The screen is bright, especially good for working outside.
5. No developer programs are installed by default. Xcode and Fink are needed.
6. Dashboard is handy, especially the dictionary.
7. I cannot use Kingword, an English-Chinese dictionary that uses the mouse to capture a word from the screen and translates it into Chinese automatically.
8. Sound is good.
9. Not every program is available on Mac OS, such as Camtasia.
10. The power button is too exposed. That is not good with little kids around. My son, 1 year old, constantly causes me trouble.
11. The email program Entourage is slow.

Wednesday, March 21, 2007

template LIMS thoughts

It is apparent that various forms of Laboratory Information Management System are needed by different people on the UCD campus. It has to be Web based. So I am thinking of establishing some template systems for different purposes.

Wikipedia defines LIMS as:

A Laboratory Information Management System (LIMS) is computer software that is used in the laboratory for the management of samples, laboratory users, instruments, standards and other laboratory functions such as invoicing, plate management, and work flow automation. A LIMS and a Laboratory Information System (LIS) perform similar functions. The primary difference is that LIMS are generally targeted toward environmental, research or commercial analysis, such as pharmaceutical or petrochemical, and LIS are targeted toward the clinical market (hospitals and other clinical labs).

I found a nice blog post about several Web-based systems at

http://labsoftnews.typepad.com/lab_soft_news/2006/03/the_first_brows.html

There is also a website that has a lot of discussion (http://www.limsfinder.com/).

Most existing systems, especially commercial ones, are for pharmaceutical or biotech companies to track chemicals and data from high-throughput instruments. They are not only expensive, but also not what most genomics-based biologists need.

There are some open source solutions too, like:

Bika: From a South African company. A system developed on Plone and Python.
Website: www.bikalabs.com/

Halx: A system developed for Structural Genomics activities. I like their design, which separates the presentation from the data model.
Website: http://halx.genomics.eu.org/

BASE: It is for microarrays.
Website: http://base.thep.lu.se/

My idea is to see if we can develop different modules that simplify the process of building a customized system. For example:

User administration: User authentication and authorization modules
Data and File management: Data upload and download
Billing
Communications
Report
Visualization
Backend bulk upload
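
To make the module idea a bit more concrete, here is a minimal sketch of how such pieces might plug together. Everything in it is my own assumption for illustration: the class names (User, Module, FileManagement, Billing, Lims), the role names, and the tuple-shaped requests. It is not taken from Bika, Halx, or BASE.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


class Module:
    """Base class: each module has a name and the role needed to use it."""
    name = "module"
    required_role = "member"

    def handle(self, user, request):
        raise NotImplementedError


class FileManagement(Module):
    """Data and file management: upload and download, kept in memory here."""
    name = "files"
    required_role = "member"

    def __init__(self):
        self.files = {}

    def handle(self, user, request):
        action, filename, *data = request
        if action == "upload":
            self.files[filename] = data[0]
            return "stored " + filename
        if action == "download":
            return self.files[filename]
        raise ValueError("unknown action: " + action)


class Billing(Module):
    """Billing: only administrators may create invoices in this sketch."""
    name = "billing"
    required_role = "admin"

    def handle(self, user, request):
        action, item = request
        return "invoice for " + item + " created by " + user.name


class Lims:
    """A customized system is just a chosen set of modules."""

    def __init__(self, modules):
        self.modules = {module.name: module for module in modules}

    def dispatch(self, user, module_name, request):
        module = self.modules[module_name]
        # Authorization is checked once here, not inside every module.
        if module.required_role not in user.roles:
            raise PermissionError(user.name + " may not use " + module_name)
        return module.handle(user, request)


if __name__ == "__main__":
    lims = Lims([FileManagement(), Billing()])
    alice = User("alice", {"member"})
    print(lims.dispatch(alice, "files", ("upload", "plate1.csv", b"A1,0.42")))
```

A lab that does not need billing would simply leave that module out of the list, which is the kind of template approach I have in mind.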