Contributing to DeepChem as a Scientist
The scientific community is in many ways quite traditional. Students typically learn through apprenticeship with advisors who teach a small number of students directly. This system has endured for centuries and allows expert scientists to teach their ways of thinking to new students.
For more context, most scientific research today is done in “labs” run in this mostly traditional fashion. A principal investigator (PI) will run the lab and work with undergraduate, graduate, and postdoctoral students who produce research papers. Labs are funded by “grants,” typically from governments and philanthropic agencies. Papers and citations are the critical currencies of this system, and a strong publication record is necessary for any scientist to establish themselves.
This traditional model often struggles to fund the development of high-quality software, for a few reasons. First, students are in a lab for limited periods of time (often 3-5 years). This means there’s high turnover, and critical knowledge can be lost when a student moves on. Second, grants for software are still new and not broadly available. A lab might very reasonably choose to focus on scientific discovery rather than on necessary software engineering. (It’s worth noting, though, that there are many exceptions that prove the rule! DeepChem was born in an academic lab, like many other quality projects.)
We believe that contributing to and using DeepChem can be highly valuable for scientific careers. DeepChem can help maintain new scientific algorithms for the long term, making sure that your discoveries continue to be used after students graduate. We’ve seen too many brilliant projects flounder after students move on, and we’d like to help you make sure that your algorithms have the most impact.
The answer to this really depends on what you’re looking for out of your career! Making and maintaining good software is hard. It requires careful testing and continued maintenance. Your code will bitrot over time without attention. If your focus is on new inventions and you find software engineering less compelling, working with DeepChem may enable you to go further in your career by letting you focus on new algorithms and leveraging the DeepChem Project’s infrastructure to maintain your inventions.
In addition, you may find considerable inspiration from participating in the DeepChem community. Looking at how other scientists solve problems and connecting with new collaborators across the world can help you see problems in a new way. Longtime DeepChem contributors find that they often end up writing papers together!
All that said, there may be very solid reasons for you to build your own project, especially if you want to explore designs that we haven’t explored or can’t easily support. In that case, we’d still love to collaborate with you. DeepChem depends on a broad constellation of scientific packages, and we’d love to make your package’s features accessible to our users.
While DeepChem was born in the Pande lab at Stanford, the project now lives as a “decentralized research organization.” It would be more accurate to say that there are informally multiple “DeepChem PIs,” who use it in their work. You too can be a DeepChem PI!
I want to establish my scientific niche. How can I do that as a DeepChem contributor? Won’t my contribution be lost in the noise?
It’s critically important for a new scientist to establish themselves and their contributions in order to launch a scientific career. We believe that DeepChem can help you do this! If you add a significant set of new features to DeepChem, it might be appropriate for you to write a paper (as lead author, corresponding author, or whatever arrangement makes sense) that introduces the new features and your contribution.
As a decentralized research organization, we want to help you launch your career. We’re very open to other collaboration structures that work for your career needs.
Yes! DeepChem’s core mission is to democratize the use of deep learning for the sciences. This means no barriers, no walls. Anyone is welcome to join and contribute. Join our developer calls or chat one-on-one with our scientists, many of whom are glad to work with new students. You may form connections that help you join a more traditional lab, or you may choose to forge your own path. We’re glad to support either.
Not yet, but we’re actively looking into getting grants to support DeepChem researchers. If you’re a PI who wants to collaborate with us, please get in touch!
Yes! The most powerful feature of DeepChem is its community. Becoming part of the DeepChem project can let you build a network that lasts across jobs and roles. Lifelong employment at a corporation is less and less common. Joining our community will let you build bonds that cross jobs and could also help you do your current job better!
One of the core goals for DeepChem is to build a shared set of scientific resources and techniques that aren’t locked up by patents. Our hope is to enable your company or organization to leverage techniques with less worry about patent infringement.
We ask in return that you act as a responsible community member and put in as much as you get out. If you find DeepChem very valuable, please consider contributing back some innovations or improvements so others can benefit. If you’re getting a patent on your invention, try to make sure that you don’t infringe on anything in DeepChem. Lots of things sneak past patent review. As an open source community, we don’t have the resources to actively defend ourselves and we rely on your good judgment and help!
Not at all! DeepChem is released with a permissive MIT license. Any analyses you perform belong entirely to you. You are under no obligation to release your proprietary data or inventions.
If you are interested in open sourcing data, the DeepChem project maintains the [MoleculeNet](https://deepchem.readthedocs.io/en/latest/moleculenet.html) suite of datasets. Adding your dataset to MoleculeNet can be a powerful way to ensure that a broad community of users can access your released data in a convenient fashion. It’s important to note that MoleculeNet provides programmatic access to data, which may not be appropriate for all types of data (especially clinical or patient data, which may be governed by regulations or laws). Open source datasets can be a powerful resource, but they need to be handled with care.
Not anymore! Any scientific datasets are welcome in MoleculeNet. At some point in the future, we may rename the effort to avoid confusion, but for now, we emphasize that non-molecular datasets are welcome too.
MoleculeNet already supports datasets released under different licenses. We can work with you to use your license of choice.