Science of Machine Learning (was Machine Learning , some thoughts)

Philippe Ameline philippe.ameline at
Sat Jun 30 11:16:32 EDT 2018

On 27/06/2018 at 22:26, Bert Verhees wrote:

> On 27-06-18 16:43, Philippe Ameline wrote:
>> 1) you can find a group of practitioners who agree to work extra
>> hours to annotate a large batch of images, or
> Did I tell you about the plant-app? I believe I did. 700.000 pictures
> are reviewed, often by volunteers.
> The app recognizes 16000 plants. What matters is how you do it, and that
> it does not cost the volunteers extra effort, for example in relation to
> what they do anyway.
> It is a French product.

Dear Bert,

The plant-app was the subject of your initial post.

The mathematics behind deep learning is still being studied. To make it
short, it remains somewhat mysterious, since such classification
algorithms "should not work", but actually, they do ;-)

According to an article I just read, such NP-complete problems are
similar to finding a needle in a haystack, and algorithms that tackle
them shouldn't provide valuable answers... unless the conditions (large
enough needle, correctly ordered stack) make the problem tractable.

To sum it up, data quality (signal-to-noise ratio) is paramount. In
the plant-app you mentioned, adding a certain amount of noise
(mislabeling images or adding images of objects that are not
plants) could probably make the whole app plainly crappy.
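
To make the label-noise point concrete, here is a toy sketch in Python. It
is only an illustration under loud assumptions: a 1-nearest-neighbour
classifier on made-up points along a line stands in for a model that
memorizes its training labels, which is not how the plant-app or a deep
network actually works. All function names (make_dataset, corrupt_labels,
predict_1nn, accuracy_under_noise) are hypothetical.

```python
import random

def make_dataset(n):
    """Return n points on a line; the true class is 1 for the upper half."""
    xs = [float(i) for i in range(n)]
    ys = [1 if i >= n // 2 else 0 for i in range(n)]
    return xs, ys

def corrupt_labels(ys, noise_rate, seed=0):
    """Flip exactly round(noise_rate * n) labels, chosen at random."""
    rng = random.Random(seed)
    flipped = list(ys)
    for i in rng.sample(range(len(ys)), round(noise_rate * len(ys))):
        flipped[i] = 1 - flipped[i]
    return flipped

def predict_1nn(train_xs, train_ys, x):
    """Predict the label of the closest training point."""
    nearest = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_ys[nearest]

def accuracy_under_noise(noise_rate, n=200):
    """Train on noisy labels, then score against the true labels."""
    xs, true_ys = make_dataset(n)
    noisy_ys = corrupt_labels(true_ys, noise_rate)
    # Each test point sits right next to one training point
    # and shares that point's true (uncorrupted) label.
    test_xs = [x + 0.25 for x in xs]
    hits = sum(predict_1nn(xs, noisy_ys, tx) == ty
               for tx, ty in zip(test_xs, true_ys))
    return hits / n

if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.3):
        print(f"label noise {rate:.0%}: accuracy {accuracy_under_noise(rate):.2f}")
```

In this toy setup every mislabeled training point produces exactly one
wrong answer, so accuracy falls from 1.00 to 0.90 and 0.70 as 10% and 30%
of the labels are flipped. Real networks are more tolerant than this
memorizing stand-in, but the direction of the effect is the same.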

All this to say that building a deep learning system starts with making
certain that the data it will be fed is of proper quality. This is
usually not the case in medicine, largely because IT is considered a
back-office concern and there is seldom the kind of feedback loop that
could lead to errors being fixed.

My point is that you can perfectly well (though with considerable effort)
organize a trained network of practitioners to feed a "data lake" in
order to train a neural network... but you will probably be disappointed
if you try to process existing information.



More information about the openEHR-clinical mailing list