The tool has a community, and people are very helpful. I've been impressed by that. I'm still trying to think of things I would want improved. I first started it up on a Windows PC, where it seemed a bit easier for some reason, but I'm now on a Mac. I have trouble with the Magic Mouse, but that's an issue with any software and not specific to KNIME. It worked well when I received data from other people. I haven't tested it on a ten-billion-item database, but it works well for hundreds of thousands to a million items. I haven't tried the big databases because that's for the commercial version.
The solution has a lot of video material for teaching and upskilling people. If I've just introduced a colleague to it, I send them links to two videos, and they learn from those. That's two hours of work for them across two sessions, and they've got a programming language under their belt.
I've never used it to build a neural net, though it offers many different neural nets. I have used it for decision trees, regression models, and so on, and KNIME has been very easy to use for those. It takes much of the work away from you. For example, if I build a decision tree, I usually want to take several samples from the training data and then choose the best outcome.
With the tool, you just put the Decision Tree Learner node in there, set what percentage you want for training and testing, say 60% for learning and 40% for testing, and how many times you want to do it, like seven times. It runs seven samples, does the training and testing, and reports back with a model. Then, you can use a Scorer node to score your results. I produced the first-ever machine learning exercise in nursing in 2008, which wasn't using KNIME, but I think KNIME was available then. That project was associated with a reduction in the dropout rate from the university from 19% to 5% over three years. And you can do all that for free with KNIME, on a dataset of about 1,200 students, on your PC with no trouble.
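To make the repeated 60/40 sampling concrete, here is a rough sketch in plain Python. It is not KNIME's implementation: a one-split tree (a "stump") stands in for a full decision tree, the data is made up, and names such as `best_of_repeated_splits` are hypothetical. It only shows the idea of training on 60%, scoring on the held-out 40%, repeating seven times, and keeping the best result.

```python
import random


def majority(values, default):
    """Most common label in values, or default when values is empty."""
    return max(sorted(set(values)), key=values.count) if values else default


def train_stump(rows):
    """Fit a one-split tree on (features, label) rows by minimising training errors."""
    overall = majority([y for _, y in rows], None)
    best = None  # (errors, feature_index, threshold, left_label, right_label)
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            left_label = majority(left, overall)
            right_label = majority(right, overall)
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]  # drop the error count, keep the split itself


def predict(model, x):
    f, threshold, left_label, right_label = model
    return left_label if x[f] <= threshold else right_label


def accuracy(model, rows):
    """Share of rows predicted correctly (a very reduced stand-in for a scorer)."""
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


def best_of_repeated_splits(rows, train_fraction=0.6, repeats=7, seed=0):
    """Repeat a random train/test split and keep the model with the best test accuracy."""
    rng = random.Random(seed)
    best_model, best_score = None, -1.0
    for _ in range(repeats):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        model = train_stump(train)
        score = accuracy(model, test)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score


# Made-up data: one numeric feature, label 1 when the feature is 10 or more.
rows = [((h,), 1 if h >= 10 else 0) for h in range(20)]
model, score = best_of_repeated_splits(rows)
print(f"best held-out accuracy: {score:.2f}")
```

Picking the best of several splits by test accuracy gives an optimistic score, so in practice a separate validation set is worth keeping aside for the final check.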
I read a report from Uppsala in Sweden where, during the pandemic, they had to repurpose many drugs developed for other diseases to find out which ones would be useful for COVID. They went through thousands, even millions, of drugs, and using the tool seemed to reduce the time to develop or choose drugs by a major factor.
The solution accessed about eight or ten different chemical databases, biobanks, and similar resources and could integrate with all of them. One of the good things about KNIME is its ability to read different data sources. For instance, it has platform connectors, allowing you to link to various external systems. It's just a matter of plugging in a node and linking it up without usually needing to write new code. For example, in my use case, reading from an Excel file used to require configuring a node manually, but now you choose the file you want to analyze, put it on your workspace, and it recognizes the file structure automatically, whether it's an Excel file, CSV, or tab-delimited file, and provides the correct node. It's quite intelligent.
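As an illustration of what recognizing the file structure can involve, here is a small Python sketch. It is an assumption about the general idea, not a description of how KNIME does it: the function name `guess_format` and the returned labels are hypothetical, Excel files are identified by extension only, and the standard library's `csv.Sniffer` guesses the delimiter from a text sample.

```python
import csv
from pathlib import Path


def guess_format(filename, sample=""):
    """Guess a file's format from its extension and a sample of its text."""
    suffix = Path(filename).suffix.lower()
    if suffix in {".xlsx", ".xls"}:
        return "excel"  # binary workbook: the extension alone decides here
    try:
        # Only commas and tabs are considered as candidate delimiters.
        dialect = csv.Sniffer().sniff(sample, delimiters=",\t")
    except csv.Error:
        return "unknown"  # the sniffer could not settle on a delimiter
    return "tab-delimited" if dialect.delimiter == "\t" else "csv"


comma_sample = "id,name,score\n1,Ann,3.5\n2,Bob,4.0\n"
tab_sample = "id\tname\tscore\n1\tAnn\t3.5\n2\tBob\t4.0\n"

print(guess_format("report.xlsx"))
print(guess_format("export.txt", comma_sample))
print(guess_format("export.txt", tab_sample))
```

The label returned could then be used to pick the matching reader, which is roughly what the automatic node selection saves you from doing by hand.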
If you’re considering using the solution for the first time, I recommend starting with a one-hour introductory video to get a basic understanding. If you need more information, check out KNIME TV, a channel with hundreds of short videos, usually 5 to 30 minutes long. These videos show you how to drag and drop nodes, configure them, and run them to see the results. It’s a very visual way of learning.
I rate the overall product a nine out of ten.