3.4. Techniques applied to the data linking process
Identity link discovery (also called linkset discovery) is a three-step process for identifying equivalent resources across different datasets: preparing the data (preprocessing, step 1), aligning resources (instance matching, step 2), and correcting erroneous links generated between some of them (post-processing, step 3). First, the resources need to be represented in a uniform manner. This preprocessing is necessary when the datasets use different vocabularies, when property values are expressed in different languages, or when the number of resources and properties to be compared is too large. Second, to establish links, resources are compared on the basis of their values. This comparison can be carried out at different levels, ranging from the URIs of the resources to the descriptions of their neighborhoods in the RDF graph. Finally, once equivalent resources are connected, some systems perform an additional step to evaluate the generated links and therefore...