Building an adverse event clustering model pipeline on SageMaker
Let us now build a pipeline to train an adverse event clustering model. The pipeline clusters adverse events detected in drug reviews using an unsupervised clustering model. This helps investigators group drugs with certain reported clinical conditions together and facilitates investigations related to adverse events. We will read some raw drug review data and extract the top clinical conditions from that data. The following diagram shows the steps of the solution:
Figure 9.1 – The pipeline workflow
As shown in the preceding diagram, we use Amazon Comprehend Medical to extract clinical conditions from the raw drug reviews. End users report these clinical conditions as adverse events experienced while taking the drug. We take the top five clinical conditions as the topics on which we would like to cluster. Clustering...
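To make the extraction step concrete, here is a minimal sketch of how the top clinical conditions could be selected from Amazon Comprehend Medical output. It assumes the reviews are sent to the `detect_entities_v2` API through boto3 and that "top" means most frequently mentioned across reviews. The function names `detect_entities` and `top_conditions`, the `min_score` confidence threshold, and the lowercase normalization are illustrative assumptions, not part of the original pipeline.

```python
from collections import Counter


def detect_entities(review_text, client=None):
    """Call Amazon Comprehend Medical on one review (requires AWS credentials)."""
    if client is None:
        import boto3  # imported lazily so the counting logic works offline

        client = boto3.client("comprehendmedical")
    return client.detect_entities_v2(Text=review_text)


def top_conditions(responses, n=5, min_score=0.5):
    """Return the n most frequently mentioned medical conditions.

    `responses` is an iterable of detect_entities_v2-style dictionaries.
    `min_score` is an assumed confidence cutoff, not a value from the text.
    """
    counts = Counter()
    for response in responses:
        for entity in response.get("Entities", []):
            # Keep only entities that Comprehend Medical labels as conditions.
            if entity.get("Category") != "MEDICAL_CONDITION":
                continue
            # Skip low-confidence detections.
            if entity.get("Score", 0.0) < min_score:
                continue
            # Normalize case so "Nausea" and "nausea" count as one condition.
            counts[entity["Text"].strip().lower()] += 1
    return [condition for condition, _ in counts.most_common(n)]
```

With a list of raw reviews in a hypothetical variable `reviews`, the call would look like `top_conditions(detect_entities(r) for r in reviews)`. Keeping the counting logic separate from the API call makes it possible to test the selection step without AWS access.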