On Thursday, September 21st, we held a Think Big Session at our San Francisco office, with talks by Sebastián Bassi (Biotechnologist & Python Developer), Antonio Fragoso (Data Scientist) and Moacy Barros (Data Architect).
Sebastián Bassi, Biotechnology specialist at Globant, talked about Python for Bioinformatics. He split his talk into four topics:
- Brief industry overview
- Intro to Genetic Information
- Intro to Biopython
- Solving 4 issues with Biopython
In the industry overview he showed some relevant stats and summarized the state of the art. The Genetic Information part covered the basics of the genetic code: transcription from DNA to RNA and translation from RNA to proteins. The third part introduced Biopython's classes and methods. The last part of the talk was based on real-world scenarios that can be solved using bioinformatics techniques. All of these techniques are covered in the book “Python for Bioinformatics 2nd edition” (available at http://py3.us/).
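To make the transcription and translation steps mentioned above concrete, here is a minimal sketch in plain Python. It is not code from the talk or the book: the function names `transcribe` and `translate`, the `to_stop` parameter, and the sample sequence are illustrative assumptions, and the codon table is the standard genetic code. Biopython's `Seq` class provides its own `transcribe()` and `translate()` methods, which would normally be used instead.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with U
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(mrna: str, to_stop: bool = True) -> str:
    """Translate mRNA into a protein string, one letter per amino acid.

    Assumption for this sketch: reading starts at position 0 and any
    trailing bases that do not form a full codon are ignored.
    """
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*" and to_stop:
            break  # stop codon reached: end of the protein
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCATTGTAATGGGCCGCTGA"  # hypothetical sample sequence
    mrna = transcribe(dna)
    print(mrna)
    print(translate(mrna))
```

The sketch keeps the two steps as separate functions to mirror the DNA to RNA to protein flow; a real analysis would also need to handle ambiguous bases and alternative codon tables, which libraries such as Biopython take care of.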
Antonio and Moacy shared their experience of rebuilding a client’s data pipeline model, moving it from a rigid and limited structure to a flexible and scalable architecture. They explained how this transformation led to an automated framework that can be triggered, validated, and measured without intervention from the Data Science team. The presentation covered how they combined good software engineering practices with data science knowledge to implement the desired architecture in Java/Spark, and how requirements shifted in later phases, requiring them to apply the same standards to Python/Spark projects. Overall, they walked through the tricky details of how they helped our clients build flexible and scalable data models.