public class ExtendTable
extends Controller
Fetch Table Functions: - the functions search() and extendedSearch() do not return new data, only the correspondences needed to fuse the new data. To get the new data itself, the fetchTable function has to be called.
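The fetchTable(String, String) documentation further down describes this second step as opening a previously saved JSON file and returning its content. A minimal sketch of that idea follows. The class name, the helper names and the directory layout `baseDir/repositoryName/name.json` are assumptions made for illustration, not the Backend's actual storage layout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not part of the ExtendTable class.
final class FetchTableSketch {

    /** Assumed layout: one JSON file per table, grouped by repository. */
    static Path tableFile(Path baseDir, String repositoryName, String name) {
        return baseDir.resolve(repositoryName).resolve(name + ".json");
    }

    /** Search side: store the data of a found table so it can be fetched later. */
    static void saveTable(Path baseDir, String repositoryName, String name, String json) {
        try {
            Path file = tableFile(baseDir, repositoryName, name);
            Files.createDirectories(file.getParent());
            Files.write(file, json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fetch side: return the saved JSON, or null if the table was never saved. */
    static String fetchTable(Path baseDir, String repositoryName, String name) {
        Path file = tableFile(baseDir, repositoryName, name);
        if (!Files.isRegularFile(file)) {
            return null;
        }
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a fresh temporary base directory for the demo. */
    static Path newTempBaseDir() {
        try {
            return Files.createTempDirectory("fetchTableSketch");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path baseDir = newTempBaseDir();
        saveTable(baseDir, "WebWikiTables", "table1", "{\"rows\":[]}");
        System.out.println(fetchTable(baseDir, "WebWikiTables", "table1"));
        System.out.println(fetchTable(baseDir, "WebWikiTables", "missing"));
    }
}
```

Keeping the table data out of the search response and serving it from a file on request keeps the search response small, which is the motivation given in the fetchTable documentation.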
Repository Maintenance Functions:
Bulk upload functions: - for uploading many tables to a repository, the bulk upload is more than an order of magnitude faster (see DS4DM Backend website)
Constructor and Description |
---|
ExtendTable() |
Modifier and Type | Method and Description |
---|---|
void |
bulkUploadTables(java.lang.String repositoryName,
java.lang.String uploadId,
long startTime,
java.lang.String requestBody)
bulkUploadTables
For uploading many tables to a repository, the bulk upload is more than an order of magnitude faster (see evaluation). |
Result |
correlationBasedSearch(java.lang.String repositoryName)
The unconstrainedSearch(String repositoryName) method extends the query table with as many columns as possible. |
Result |
createRepository(java.lang.String repositoryName)
Create a new empty repository with the specified name.
|
Result |
deleteRepository(java.lang.String repositoryName)
Deletes the specified repository.
|
Result |
extendedSearch_Produktdata()
If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead.
|
Result |
extendedSearch_T2DGoldstandard()
If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead.
|
Result |
extendedSearch_WebWikiTables()
If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead.
|
Result |
extendedSearch(java.lang.String repositoryName)
This is an old version of the extendedSearch.
|
Result |
fetchTable_Produktdata(java.lang.String name)
If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead of fetchTable(String, String). It calls fetchTable(String, String) with the repositoryName set to "Produktdata". |
Result |
fetchTable_T2DGoldstandard(java.lang.String name)
If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead of fetchTable(String, String). It calls fetchTable(String, String) with the repositoryName set to "T2D_Goldstandard". |
Result |
fetchTable_WebWikiTables(java.lang.String name)
If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead of fetchTable(String, String). It calls fetchTable(String, String) with the repositoryName set to "WebWikiTables". |
Result |
fetchTable(java.lang.String name,
java.lang.String repositoryName)
The ExtenededSearch.extendedSearch(String) method searches for tables in the repository that contain useful data for extending a table with an additional column. |
Result |
fetchTablePOST(java.lang.String repositoryName)
The fetchTablePOST function is rarely used in practice.
|
Result |
generateCorrespondences_withKnownBlocking(java.lang.String repositoryName,
java.lang.String blockingsFileName)
This function is not part of the standard Backend API.
|
Result |
generateCorrespondences(java.lang.String repositoryName)
This function is not part of the standard Backend API.
|
Result |
getRepositoryNames()
Returns the names of all existing repositories.
|
Result |
getRepositoryStatistics(java.lang.String repositoryName)
Returns information about the specified repository, such as numberOfTablesInRepository, created_timestamp and creator_ip.
|
Result |
getUploadStatus(java.lang.String repositoryName,
java.lang.String uploadID)
This function returns the status of the moderateBulkUploadTables(String) job with the specified uploadID. |
Result |
ind() |
Result |
moderateBulkUploadTables(java.lang.String repositoryName)
This method manages controllers.ExtendTable#bulkUploadTables(String, String, long, String) at a high level. |
Result |
PreCalculatedSearch() |
Result |
search()
The matching is performed by the SearchJoin Service via a POST request to http://ds4dm.informatik.uni-mannheim.de/search, where the input is the table specified above.
|
Result |
suggestAttributes() |
Result |
unconstrainedSearch(java.lang.String repositoryName)
search() and ExtenededSearch.extendedSearch(String repositoryName) provide the information necessary for extending a table with one additional column. |
Result |
uploadTable(java.lang.String repositoryName)
This uploads a single Table to a specified repository.
|
public Result search()
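The search() summary states that matching is delegated via a POST request to http://ds4dm.informatik.uni-mannheim.de/search. The following sketch shows how such a request could be assembled with the JDK's `java.net.http` API (Java 11 or later). The class name, the `Content-Type` header and the placeholder body are assumptions; the real query table format is not reproduced here, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the ExtendTable class.
final class SearchRequestSketch {

    // URL as given in the search() summary.
    static final String SEARCH_URL = "http://ds4dm.informatik.uni-mannheim.de/search";

    /** Builds (but does not send) a POST request carrying the query table as JSON. */
    static HttpRequest buildSearchRequest(String queryTableJson) {
        return HttpRequest.newBuilder(URI.create(SEARCH_URL))
                .header("Content-Type", "application/json") // assumed content type
                .POST(HttpRequest.BodyPublishers.ofString(queryTableJson, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // "{}" is a placeholder body, not a valid query table.
        HttpRequest request = buildSearchRequest("{}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```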
public Result PreCalculatedSearch()
public Result extendedSearch(java.lang.String repositoryName)
ExtenededSearch.extendedSearch(String repositoryName)
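public Result extendedSearch_T2DGoldstandard()
If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead. It calls ExtenededSearch.extendedSearch(String repositoryName) with the repositoryName set to "T2D_Goldstandard".
public Result extendedSearch_Produktdata()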
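If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead. It calls ExtenededSearch.extendedSearch(String repositoryName) with the repositoryName set to "ProductDataRepository_withSubjectcolumns".
public Result extendedSearch_WebWikiTables()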
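If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead. It calls ExtenededSearch.extendedSearch(String repositoryName) with the repositoryName set to "WebWikiTables".
public Result unconstrainedSearch(java.lang.String repositoryName)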
search() and ExtenededSearch.extendedSearch(String repositoryName) provide the information necessary for extending a table with one additional column.
UnconstrainedSearch.getFusedTable(model.QueryTable queryTableObject, String repositoryName)
repositoryName - the name of the repository used for the unconstrainedSearch
request().body().asJson() - the JSON in the body of the HTTP POST request is also used as a parameter. There is more info on it here.
public Result correlationBasedSearch(java.lang.String repositoryName) throws java.io.IOException, java.lang.InterruptedException
unconstrainedSearch(String repositoryName) extends the query table with as many columns as possible. With a big repository, this can add significantly more than 100 columns to the query table, which can be overwhelming for the user.
The correlationBasedSearch addresses this: it extends the query table only with columns that correlate with the "correlation attribute" (specified by the parameter "correlationAttribute" in the HTTP request body).
UnconstrainedSearch.getFusedTable(model.QueryTable, String)
repositoryName - the name of the repository used for the unconstrainedSearch
request().body().asJson() - the JSON in the body of the HTTP POST request is also used as a parameter. There is more info on it here.
java.io.IOException
java.lang.InterruptedException
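To make the idea of keeping only correlating columns concrete, here is a small sketch. It is not the Backend's implementation: the use of the Pearson coefficient, the absolute-value threshold, and all class and method names are assumptions chosen for illustration, and only numeric columns are considered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the correlation measure actually used is not documented here.
final class CorrelationFilterSketch {

    /** Pearson correlation coefficient of two equally long numeric columns. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("columns must have the same length (at least 2)");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        if (varX == 0.0 || varY == 0.0) {
            return 0.0; // a constant column is treated as uncorrelated
        }
        return cov / Math.sqrt(varX * varY);
    }

    /** Names of the candidate columns whose |r| with the correlation attribute reaches the threshold. */
    static List<String> correlatedColumns(Map<String, double[]> candidates,
                                          double[] correlationAttribute,
                                          double threshold) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : candidates.entrySet()) {
            if (Math.abs(pearson(entry.getValue(), correlationAttribute)) >= threshold) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] population = {1, 2, 3, 4, 5};            // the correlation attribute
        Map<String, double[]> candidates = new LinkedHashMap<>();
        candidates.put("area", new double[] {2, 4, 6, 8, 10}); // moves with population
        candidates.put("code", new double[] {5, 1, 4, 2, 3});  // only weakly related
        System.out.println(correlatedColumns(candidates, population, 0.5));
    }
}
```

With a threshold of 0.5, only the "area" column would be added to the query table in this toy example.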
public Result getRepositoryNames()
public Result getRepositoryStatistics(java.lang.String repositoryName)
repositoryName - the name of the repository for which the information should be returned
public Result deleteRepository(java.lang.String repositoryName)
repositoryName - the name of the repository being deleted
request().body().asJson() - (optional) admin password in the body of the HTTP POST request.
public Result createRepository(java.lang.String repositoryName)
repositoryName - the name of the repository being created
public Result uploadTable(java.lang.String repositoryName)
TableIndexer.writeTableToIndexes(File)
FindCorrespondences.getInstanceMatches(DataSet, DataSet, File, File, String)
FindCorrespondences.getDuplicateBasedSchemaMatches(DataSet, DataSet, File, File, Processable, String)
repositoryName - the name of the repository to which the table should be uploaded
request().body().asJson() - the request body should contain the data of the table that is to be uploaded.
public Result suggestAttributes()
public Result moderateBulkUploadTables(java.lang.String repositoryName) throws java.io.IOException
This method manages controllers.ExtendTable#bulkUploadTables(String, String, long, String) at a high level.
Among other things, it ensures that the bulk upload is executed as a separate process that doesn't block the web service.
controllers.ExtendTable#bulkUploadTables(String, String, long, String) in a separate process
repositoryName - the name of the repository to which the tables should be uploaded
request().body().asJson() - the request body should contain a list of tables to be uploaded. The format of this JSON string is specified here
java.io.IOException
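The pattern described for moderateBulkUploadTables (start the upload in the background, return an identifier immediately, record progress in a status file) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a background thread via `CompletableFuture` rather than a separate operating-system process, and the class name, the `.status` file naming and the random uploadId are all hypothetical. The three status strings are the ones listed in the getUploadStatus documentation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration; none of these names belong to the DS4DM Backend.
final class BulkUploadModeratorSketch {

    /** What the caller gets back immediately: the uploadId plus a handle on the job. */
    static final class Started {
        final String uploadId;
        final CompletableFuture<Void> job;

        Started(String uploadId, CompletableFuture<Void> job) {
            this.uploadId = uploadId;
            this.job = job;
        }
    }

    private final Path statusDir;

    BulkUploadModeratorSketch(Path statusDir) {
        this.statusDir = statusDir;
    }

    /** Convenience factory that keeps the status files in a fresh temporary directory. */
    static BulkUploadModeratorSketch inTempDir() {
        try {
            return new BulkUploadModeratorSketch(Files.createTempDirectory("bulkUploadStatus"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Starts the upload in the background and returns without waiting for it. */
    Started start(Runnable upload) {
        String uploadId = UUID.randomUUID().toString();
        writeStatus(uploadId, "PROCESSING");
        CompletableFuture<Void> job = CompletableFuture.runAsync(() -> {
            try {
                upload.run();
                writeStatus(uploadId, "UPLOAD SUCCESSFUL\n");
            } catch (RuntimeException e) {
                writeStatus(uploadId, "UPLOAD UNSUCCESSFUL\n");
            }
        });
        return new Started(uploadId, job);
    }

    /** Reads the current status text for the given uploadId. */
    String readStatus(String uploadId) {
        try {
            return new String(Files.readAllBytes(statusFile(uploadId)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void writeStatus(String uploadId, String status) {
        try {
            Files.write(statusFile(uploadId), status.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private Path statusFile(String uploadId) {
        return statusDir.resolve(uploadId + ".status");
    }

    public static void main(String[] args) {
        BulkUploadModeratorSketch moderator = inTempDir();
        Started started = moderator.start(() -> { /* index the uploaded tables here */ });
        System.out.println("uploadId: " + started.uploadId);
        started.job.join(); // only the demo waits; a web service would return right away
        System.out.print(moderator.readStatus(started.uploadId));
    }
}
```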
public void bulkUploadTables(java.lang.String repositoryName, java.lang.String uploadId, long startTime, java.lang.String requestBody) throws java.io.IOException
moderateBulkUploadTables(String)
.
de.uni_mannheim.informatik.dws.ds4dm.CreateLuceneIndex.Main#main(String[])
Main.main(String[])
repositoryName - the name of the repository to which the tables are being uploaded
uploadId - the uploadId is a code that can be used by getUploadStatus(String, String) to check the status of this bulk upload while (and after) it is being executed.
startTime - the time when the execution of moderateBulkUploadTables(String) began. This is only for evaluation purposes.
requestBody - the requestBody should contain a list of tables to be uploaded. The format of this JSON string is specified here
java.io.IOException
public Result getUploadStatus(java.lang.String repositoryName, java.lang.String uploadID)
This function returns the status of the moderateBulkUploadTables(String) job with the specified uploadID. When a moderateBulkUploadTables(String) job is started, it returns its uploadID in the return message.
During the actual bulk upload (which runs in a separate process), the status of the bulk upload is written to a file. The status may be "PROCESSING", "UPLOAD SUCCESSFUL\n" or "UPLOAD UNSUCCESSFUL\n".
getUploadStatus reads the status from the file and returns it to the user.
repositoryName - the name of the repository to which the tables are being uploaded
uploadID - the uploadID is the unique identifier of an upload process. It is returned by moderateBulkUploadTables(String)
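A client that started a bulk upload would typically call getUploadStatus repeatedly until the status is no longer "PROCESSING". The sketch below shows one way to interpret the three documented status strings and to poll for completion. The class, enum and method names are hypothetical, and the status source is passed in as a `Supplier` so that the example stays independent of any particular HTTP client or route.

```java
import java.util.function.Supplier;

// Hypothetical client-side helper; not part of the DS4DM Backend.
final class UploadStatusPollerSketch {

    enum State { PROCESSING, SUCCESSFUL, UNSUCCESSFUL, UNKNOWN }

    /** Maps the raw status text to a state; the trailing newline is ignored. */
    static State parse(String raw) {
        if (raw == null) {
            return State.UNKNOWN;
        }
        switch (raw.trim()) {
            case "PROCESSING":
                return State.PROCESSING;
            case "UPLOAD SUCCESSFUL":
                return State.SUCCESSFUL;
            case "UPLOAD UNSUCCESSFUL":
                return State.UNSUCCESSFUL;
            default:
                return State.UNKNOWN;
        }
    }

    /**
     * Asks the status source up to maxAttempts times, pausing delayMillis between attempts,
     * and returns the first state that is not PROCESSING (or the last state seen).
     */
    static State pollUntilFinished(Supplier<String> statusSource, int maxAttempts, long delayMillis) {
        State state = State.UNKNOWN;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            state = parse(statusSource.get());
            if (state != State.PROCESSING) {
                return state;
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return state;
            }
        }
        return state;
    }

    public static void main(String[] args) {
        // Stand-in for repeated getUploadStatus calls: two "PROCESSING" answers, then success.
        java.util.Iterator<String> answers =
                java.util.List.of("PROCESSING", "PROCESSING", "UPLOAD SUCCESSFUL\n").iterator();
        System.out.println(pollUntilFinished(answers::next, 10, 10L));
    }
}
```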
public Result generateCorrespondences(java.lang.String repositoryName) throws java.io.IOException
java.io.IOException
public Result generateCorrespondences_withKnownBlocking(java.lang.String repositoryName, java.lang.String blockingsFileName) throws java.io.IOException
java.io.IOException
public Result fetchTablePOST(java.lang.String repositoryName)
The fetchTablePOST function is rarely used in practice. Use the fetchTable(String, String) GET method instead.
The GET and the POST method work in exactly the same way. More info here: fetchTable(String, String)
public Result fetchTable_T2DGoldstandard(java.lang.String name)
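If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead of fetchTable(String, String).
It calls fetchTable(String, String) with the repositoryName set to "T2D_Goldstandard".
public Result fetchTable_Produktdata(java.lang.String name)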
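If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead of fetchTable(String, String).
It calls fetchTable(String, String) with the repositoryName set to "Produktdata".
public Result fetchTable_WebWikiTables(java.lang.String name)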
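If the DS4DM Frontend does not specify the repositoryName, then this function can be used instead of fetchTable(String, String).
It calls fetchTable(String, String) with the repositoryName set to "WebWikiTables".
public Result fetchTable(java.lang.String name, java.lang.String repositoryName)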
The ExtenededSearch.extendedSearch(String) method searches for tables in the repository that contain useful data for extending a table with an additional column.
It returns the names of the found tables as well as their correspondences to the query table (which are necessary for constructing the additional column). However, it does not return the actual data from the found tables, as the HTTP response messages would be too big.
In the ExtenededSearch.extendedSearch(String) method, the found tables are saved to JSON files (using ExtenededSearch.saveTableDataForFetching(model.ExtendedTableInformation, extendedSearch2.GlobalVariables)).
The fetchTable method just opens the requested JSON file and returns its content.
public Result ind()