Mass imports/uploads

Overview

This API makes it possible to carry out mass file uploads.
The process is based on jobs and import items attached to these jobs.
An item to be imported consists of a binary and qualification data. The item is a temporary object: it does not exist in the database, but its simulated view can be obtained at any time.
A job is attached to a user. The process consists of attaching the items to be imported to this job, qualifying them, then launching the actual creation of the final objects.
Item qualifications and object creations are asynchronous (the job work is carried out asynchronously).

An item corresponds to any type of asset. A qualifying property that has the same name in two different asset types must be of the same type (and nature, if applicable) in both.
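The constraint above can be checked before configuring asset types. The following is a minimal sketch, not part of the API: the schema layout (a mapping of property name to a type and nature pair) and the asset type names are assumptions made for illustration.

```python
from collections import defaultdict


def find_type_conflicts(asset_types):
    """Return the properties declared with different (type, nature) pairs.

    asset_types maps an asset type name to {property name: (type, nature)}.
    """
    seen = defaultdict(dict)  # property -> {(type, nature): [asset type names]}
    for asset_type, properties in asset_types.items():
        for prop, signature in properties.items():
            seen[prop].setdefault(signature, []).append(asset_type)
    # A property is in conflict when more than one signature was recorded.
    return {prop: sigs for prop, sigs in seen.items() if len(sigs) > 1}


# Hypothetical asset types, for illustration only.
schemas = {
    "photo": {"title": ("word", None), "keywords": ("child", "keyword")},
    "video": {"title": ("word", None), "keywords": ("text", None)},
}
conflicts = find_type_conflicts(schemas)
```

Here keywords is reported because it is a child property in one type and a text property in the other, while title is consistent.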

End points

standard URI

All standard endpoints for massimport start with /api/rest/massimport.

standard generic URI

End points that start with /api/rest/massimport/data have the same characteristics as standard end points of the type /api/rest/{dam/}data.

catalog

GET /api/rest/massimport or GET /api/rest/massimport/data (or GET /api/rest/massimport/catalog)

To get headers, use parameter headers:

GET /api/rest/massimport?headers=true

Jobs end points

Get headers

GET /api/rest/massimport/job/headers

Get job data

GET /api/rest/massimport/job/{jobid}

Parameters:

  • withItems, boolean (false by default)
    export also job items data

  • withSimu, boolean (false by default)
    export simulation for each item

  • resetSimu, boolean (false by default)
    if withSimu=true, force reset of the simulation cache

  • maxItems, int (infinite by default)
    limit the number of items; a negative value means infinite
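As an illustration of how these parameters combine, here is a minimal sketch that only builds the request URL. The host name is a placeholder, and sending boolean values as the strings true and false is an assumption based on the headers=true example.

```python
from urllib.parse import urlencode

BASE = "https://dam.example.com"  # placeholder host, not a real server


def job_data_url(job_id, with_items=False, with_simu=False,
                 reset_simu=False, max_items=None):
    """Build the URL of GET /api/rest/massimport/job/{jobid}."""
    params = {}
    if with_items:
        params["withItems"] = "true"
    if with_simu:
        params["withSimu"] = "true"
        # resetSimu is only meaningful together with withSimu=true.
        if reset_simu:
            params["resetSimu"] = "true"
    if max_items is not None:
        params["maxItems"] = str(max_items)
    url = f"{BASE}/api/rest/massimport/job/{job_id}"
    return f"{url}?{urlencode(params)}" if params else url
```

Parameters left at their default are simply omitted from the query string.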

Get job list

GET /api/rest/massimport/job for infinite list

GET /api/rest/massimport/job/list for paginated list

Parameters:

Any standard parameters of list or search query, plus specific parameters (see Get job data)

Get job item list

GET /api/rest/massimport/job/{jobid}/item for infinite list

GET /api/rest/massimport/job/{jobid}/item/list for paginated list

In these paths, you can use either item or items.

Parameters:

Any standard parameters of list or search query, plus specific parameters (see Get job data)

Show workflow of job

GET /api/rest/massimport/job/workflow

Show new possible workflow action

GET /api/rest/massimport/job/{jobid}/workflow

The old way continues to work:

GET /api/rest/massimport/job/{jobid}?workflows=true&withData=false

Create a new job

POST /api/rest/massimport/job

The jobowner is automatically set to the surfer that invokes the end point.

Create a new job with data (multipart/form-data)

Provide a JSON in a field named json. To use another field name, give that name in a parameter named jsonproperty (in the form or in the query).

For example, to set the job name and the job owner (id=11):

curl --request POST \
  --url https://<host>/api/rest/massimport/job \
  --header 'authorization: xxxx' \
  --header 'content-type: multipart/form-data; boundary=---011000010111000001101001' \
  --form 'json={"name": "test","jobowner": 11}'
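The jsonproperty mechanism can be sketched as follows. This only prepares the form fields and sends nothing; the field name payload is a hypothetical choice made for illustration.

```python
import json


def job_form_fields(job, field_name="json"):
    """Return the multipart/form-data fields describing a job.

    When the JSON travels in a field other than "json", the jsonproperty
    parameter tells the API which field holds it.
    """
    fields = {field_name: json.dumps(job)}
    if field_name != "json":
        fields["jsonproperty"] = field_name
    return fields


default_fields = job_form_fields({"name": "test", "jobowner": 11})
custom_fields = job_form_fields({"name": "test", "jobowner": 11}, field_name="payload")
```

With the default field name, jsonproperty is not needed and is left out.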

Create a new job and attach items

You can attach items to the new job with a field named itemIds.

To specify queries that select assets, use the prefix itemIds_ followed by the name of the asset resource (example: itemIds_asset={"owner":"123"}).

Example with a list of items (here ids 1 and 2):

curl --request POST \
  --url https://<host>/api/rest/massimport/job \
  --header 'authorization: xxxx' \
  --header 'content-type: multipart/form-data; boundary=---011000010111000001101001' \
  --form 'json={ "name": "test", "itemIds": [1,2] }'

Example with a query (here, all items created on June 3rd, 2020):

curl --request POST \
  --url https://<host>/api/rest/massimport/job \
  --header 'authorization: xxxx' \
  --header 'content-type: multipart/form-data; boundary=---011000010111000001101001' \
  --form 'json={ "name": "test", "itemIds": { "created": {"eq": "20200603"} } }'

When creating a job, this is a simplification of the general syntax (see Add or remove items (asynchronous)).

Example with independent field:

Parameters can be set in query:

  • add_itemIds (or itemIds): to add items by id (multiple values)

  • add_itemIds_query to add items by query (single value)

  • remove_itemIds: to remove items by id (multiple values)

  • remove_itemIds_query to remove items by query

  • add_itemIds_ followed by resource name to add assets by id (creating item for asset)

  • remove_itemIds_ followed by resource name to remove assets items by id

  • add_itemIds_query_ followed by resource name to add asset by query

  • remove_itemIds_query_ followed by resource name to remove asset items by query
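The naming scheme of these query parameters can be sketched as below. Two points are assumptions, not statements from this page: multiple ids are sent by repeating the parameter, and a query is sent as a serialized JSON string.

```python
import json
from urllib.parse import urlencode, parse_qs


def item_change_query(add_ids=(), remove_ids=(), add_query=None,
                      remove_query=None, resource=None):
    """Build the query string that adds or removes items (or asset items)."""
    # With a resource name, the parameters target assets of that resource.
    suffix = f"_{resource}" if resource else ""
    pairs = [(f"add_itemIds{suffix}", str(i)) for i in add_ids]
    pairs += [(f"remove_itemIds{suffix}", str(i)) for i in remove_ids]
    if add_query is not None:
        pairs.append((f"add_itemIds_query{suffix}", json.dumps(add_query)))
    if remove_query is not None:
        pairs.append((f"remove_itemIds_query{suffix}", json.dumps(remove_query)))
    return urlencode(pairs)


by_id = item_change_query(add_ids=[1, 2], remove_ids=[3])
by_query = item_change_query(add_query={"created": {"eq": "20200603"}})
assets = item_change_query(add_ids=[42], resource="asset")
```

The resulting string is appended to the job URL after a question mark.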

Delete job

DELETE /api/rest/massimport/job/{id}

Parameters:

  • withItems (boolean, default is false): if true, reports attached items

  • deleteItems (boolean or string, default is false): if true, the items are removed; if false, the items are detached (their job is set to null). If the value is the id of a job, the items are attached to that job.

  • withSecondary (boolean, default is true), delete also secondary jobs
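The three behaviours of deleteItems can be made explicit with a small helper. This is a sketch under assumptions: the host is supplied by the caller, and the helper only builds the URL of the DELETE request.

```python
from urllib.parse import urlencode


def delete_job_url(base, job_id, items="detach", with_items=False,
                   with_secondary=True):
    """Build the URL of DELETE /api/rest/massimport/job/{id}.

    items is "detach" (default), "delete", or the id of the job
    that takes over the items.
    """
    params = {}
    if with_items:
        params["withItems"] = "true"
    if items == "delete":
        params["deleteItems"] = "true"
    elif items != "detach":
        params["deleteItems"] = str(items)  # id of the receiving job
    if not with_secondary:
        params["withSecondary"] = "false"
    url = f"{base}/api/rest/massimport/job/{job_id}"
    return f"{url}?{urlencode(params)}" if params else url
```

For example, delete_job_url(base, 5, items=7) deletes job 5 and attaches its items to job 7.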

Items end points

Headers

GET /api/rest/massimport/item/headers

Create a new item (binary in body)

Use POST /api/rest/massimport/item, with binary file in body

The filename must be provided (it is the final stored filename and is used to determine the MIME type of the file):

  • either by content-disposition header
    Example: content-disposition: attachment; filename="filename.jpg"

  • or query parameter
    Example: filename=filename.jpg

The content type must be provided by the content-type header, and can be anything but application/json, which is reserved for metadata. To upload a binary JSON file, use any other content type (e.g. application/octet-stream). The final MIME type of the file is determined from the filename extension.
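The rules above can be sketched with the Python standard library. The request is only prepared, not sent; the host is a placeholder, authentication is left out, and the filename is not escaped, so this is an illustration rather than a complete client.

```python
import urllib.request


def binary_item_request(base, filename, data,
                        content_type="application/octet-stream"):
    """Prepare POST /api/rest/massimport/item with the binary in the body."""
    # application/json is reserved for metadata, so refuse it for binaries.
    if content_type.split(";")[0].strip().lower() == "application/json":
        raise ValueError("application/json is reserved for metadata")
    return urllib.request.Request(
        f"{base}/api/rest/massimport/item",
        data=data,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Content-Disposition": f'attachment; filename="{filename}"',
        },
    )


# A JSON file uploaded as a binary: the MIME type comes from the extension.
req = binary_item_request("https://dam.example.com", "settings.json", b'{"a": 1}')
```

Passing the prepared request to urllib.request.urlopen would send it.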

Example (curl):

Create a new item (multipart/form-data)

Use POST /api/rest/massimport/item, with content-type multipart/form-data.

The binary file must be provided in the field named binary. A filename other than the one given in the content-disposition of that field can be provided with the query parameter named prop_binary or the multipart/form-data field binary: this filename will be the final filename.
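A multipart/form-data body carrying the binary field can be assembled by hand, as in the sketch below. It assumes that item data travels in a field named json, as for jobs; the file name, the item name and the job id are illustrative values.

```python
import json
import uuid


def encode_multipart(fields, files):
    """Encode text fields and files as a multipart/form-data body.

    fields: {name: str}; files: {name: (filename, bytes, content_type)}.
    Returns (content-type header value, body bytes).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, content, content_type) in files.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            f"Content-Type: {content_type}".encode(),
            b"",
            content,
        ]
    # Closing delimiter, followed by a final line break.
    lines += [f"--{boundary}--".encode(), b""]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)


content_type, body = encode_multipart(
    fields={"json": json.dumps({"name": "cover", "job": 2})},
    files={"binary": ("cover.jpg", b"\xff\xd8\xff", "image/jpeg")},
)
```

The returned content type, including the generated boundary, goes in the content-type header of the POST request.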

Example (curl, with filename in multipart/form-data):

In this mode you can specify data for the item in a json. See the

Example setting target job and name (curl):

Create a new item from an asset

Proceed as if creating an item from scratch and, in addition, use the asset parameter (or $asset) to specify the source asset as:

  • uid: type then id, separated with underscore (example: asset_42)

  • uuid

The parameter can be included in the json.
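The uid format and its use inside the JSON can be sketched as follows. The property name job for the target job comes from the item fields listed in this page; passing the job as a plain id is an assumption.

```python
import json


def asset_uid(asset_type, asset_id):
    """Build a uid: the type, then the id, separated by an underscore."""
    return f"{asset_type}_{asset_id}"


def item_from_asset_payload(source, job_id=None):
    """Return the JSON describing an item created from an existing asset.

    source is either a uid (see asset_uid) or a uuid.
    """
    payload = {"asset": source}
    if job_id is not None:
        payload["job"] = job_id  # assumption: the job is referenced by its id
    return json.dumps(payload)


payload = item_from_asset_payload(asset_uid("asset", 42), job_id=2)
```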

Example (curl), with JSON in body, targeting job 2:

Add or remove items (asynchronous)

The parameter itemIds, used in a job modification request, adds items to the job or removes items from it (removed items are detached and left without any job). It can be passed as a separate parameter (multipart/form-data field, query parameter…) or inside the JSON description sent to modify the properties of the job.

The JSON contains one or both of the following properties:

  • add

    • an array of item ids (string or number)

    • a jsonquery to select items

  • remove

    • an array of item ids (string or number)

    • a jsonquery to select items
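The structure described above can be sketched as a small builder. The query syntax reuses the created/eq example from the job creation section; wrapping the result in an itemIds property of the job JSON follows the description of the parameter.

```python
import json


def item_ids_change(add=None, remove=None):
    """Build the itemIds value: each side is a list of ids or a query object."""
    change = {}
    for key, selector in (("add", add), ("remove", remove)):
        if selector is None:
            continue
        if not isinstance(selector, (list, dict)):
            raise TypeError(f"{key} must be a list of ids or a query object")
        change[key] = selector
    if not change:
        raise ValueError("at least one of add or remove is required")
    return change


# Add items 1 and 2, and remove every item created on June 3rd, 2020.
job_update = {
    "itemIds": item_ids_change(add=[1, 2], remove={"created": {"eq": "20200603"}})
}
body = json.dumps(job_update)
```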

Example to add items:

Example to add all items created on June 3rd, 2020:

When creating a job, as you can only add items, you can simplify the json by indicating only the value of the add property.

Example to create a job with name “test” with all data in the json description:

Same example with simplified syntax:

Same example with itemIds in its own field:

You can also use query parameters for ids: add_itemIds to add, remove_itemIds to remove.

Example:

You can also use query parameters for queries: add_itemIds_query to add, remove_itemIds_query to remove.

Example:

These approaches can be combined, but if a parameter is present in the multipart/form-data, its query equivalent is ignored.

The request returns immediately and the modifications are scheduled to run asynchronously. Until they have completed, the job is marked outstanding (this state is available in the status section of the job data).
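A client typically waits for the outstanding state to clear before going further. The sketch below shows one way to poll; how the job is fetched and how the outstanding state is read from the job data are left to caller-supplied functions, because the exact shape of the status section is not described here.

```python
import time


def wait_while_outstanding(fetch_job, is_outstanding, interval=2.0,
                           timeout=60.0, sleep=time.sleep,
                           clock=time.monotonic):
    """Poll the job until it is no longer outstanding, then return its data.

    fetch_job() returns the current job data (for example the parsed
    response of GET /api/rest/massimport/job/{jobid}); is_outstanding(job)
    tells whether modifications are still pending.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        if not is_outstanding(job):
            return job
        if clock() >= deadline:
            raise TimeoutError("job is still outstanding")
        sleep(interval)
```

The sleep and clock parameters exist so that the loop can be exercised without real delays.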

Add or remove items (asynchronous)

Do a request on an item to modify the property job (or set it when creating the item).

Example:

Qualify items (asynchronous)

You can schedule the qualification of items on a job with the parameter metadata. This parameter works like the itemIds parameter (it can be placed in the JSON, in a multipart/form-data field, or in a query parameter).

The value is a JSON with:

  • a property ids, listing the ids of the items you want to qualify (an array of ids, or a single id if there is only one item to change)

  • tagging data, one or more of the following properties:

    • type (or datatype) to set the target type (if different from the defaultcollection of the job). If it is not set and the data property value contains a $resource, that resource is used

    • data: a JSON, with the same syntax as any API modification

    • workflow: a workflow action name or id
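The shape of the metadata value can be sketched as a builder. The property names come from the list above; the ids, the data content and the choice of the approve action are illustrative values only.

```python
import json


def qualification(ids, data=None, type_=None, workflow=None):
    """Build the metadata value used to qualify the items of a job."""
    if data is None and type_ is None and workflow is None:
        raise ValueError("provide at least one of type, data or workflow")
    payload = {"ids": ids}
    if type_ is not None:
        payload["type"] = type_
    if data is not None:
        payload["data"] = data
    if workflow is not None:
        payload["workflow"] = workflow
    return payload


# Illustrative values: rename items 10 and 11, then run the approve action.
metadata = qualification([10, 11], data={"name": "Summer campaign"}, workflow="approve")
metadata_json = json.dumps(metadata)
```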

Example:

Example:

 

Qualify item (synchronous)

To qualify an item, modify it with the parameter metadata, as when qualifying items asynchronously (by job), but without ids.

The syntax is the same as for a standard modification. If you use a JSON and the item resource already has a property named metadata, use $metadata instead.

Configuration (API)

A minimal configuration is automatically generated for both the job and item structures. Use the tag REST_API_MASS_IMPORT_INCLUDE to export a field which is not automatically selected. Use the dam/data tags to configure features (see https://crossmedia.atlassian.net/wiki/spaces/WD/pages/731316242).

Job

All the features are allowed by default:

  • read

  • list

  • create

  • update

  • workflow

  • delete

  • locks

Jobs are secured by default.

Fields exported by default:

  • id

    • read

    • list

    • topsearch

    • no update

  • name

    • read

    • list

    • topsearch

    • update

    • not required for create

    • not required for update

    • not required for patch

  • jobowner (the user who owns the job)

    • read

    • list

    • topsearch

    • update

    • required for create

    • not required for update

    • not required for patch

    • default value : surfer id

  • previousowner (the previous owner of the job, after the job has been transferred to another owner)

    • read

    • update

    • not required for create

    • not required for update

    • not required for patch

  • status

    • read

    • list

    • topsearch

    • no update (use workflow end point to modify this)

  • jobprogress (a json that describes the progress of a job after it has been started)

    • read

    • no update

  • expecteditems

    • read

    • list

    • topsearch

    • update

    • not required for create

    • not required for update

Item

All the features are allowed by default:

  • read

  • list

  • create

  • update

  • workflow

  • delete

  • locks

Items are secured by default.

Fields exported by default:

  • id

    • read

    • list

    • topsearch

    • no update

  • name

    • read

    • list

    • topsearch

    • update

    • not required for create

    • not required for update

    • not required for patch

  • status

    • read

    • list

    • topsearch

    • no update (use workflow end point to modify this)

  • job

    • read

    • list

    • topsearch

    • update

    • not required for create

    • not required for update

    • not required for patch

  • binary

    • read

    • list

    • update

    • required for create

    • not required for update

    • not required for patch

  • objectid

    • read

    • list

    • topsearch

    • no update

  • objecttype

    • read

    • list

    • topsearch

    • no update

  • rejectreason

    • read

    • no update

Configuration (plug-in)

  • enableMassImportServices: boolean, default is true
    enable or disable mass import support in REST API

  • massImportItemObjectName: string, default massimportitem
    name of structure to store data for item

  • massImportCreateWithJobOwner: boolean, default is true
    indicates, if true, that the job owner is being used to create (or simulate), or, if false, that the job execution user is being used.

  • massImportRemoveItemAfterCreate: boolean, default is true
    if true, triggers the deletion of an item as soon as the corresponding asset is created (or modified during mass tagging)

  • massImportSimuWithFaces: boolean, default is true
    indicates whether faces information should be exported in the simulations

  • 2021.3.0 massImportAsyncStoreCache: boolean, default is true
    enables asynchronous storage of the simulation cache

  • massImportJobConfig: json, empty for defaults
    a configuration for the job execution
    Here is the default configuration used and which you can use as a basis for adapting it:

    • jobProcessorThreadCount: int, default is 5
      the number of threads used by the execution service to perform asynchronous end object creation tasks

    • changeProcessorThreadCount: int, default is 5
      the number of threads used by the execution service to perform asynchronous item qualification application tasks.

    • stopChangesImmediatly: boolean, default is true
      if true, when the plug-in is stopped, we wait until all current tasks have been completed

    • keepOldJobCount: int, default is 1000
      number of completed tasks to be retained for follow-up (tasks are only retained until restarted and by cluster instance if applicable)

    • jobLockAcquireTimeout: long, default is 10000
      maximum time (in milliseconds) to acquire a lock for an execution task (qualification, or creation)

    • jobLockTimeout: long, default is 10000
      maximum time (in milliseconds) to retain a lock for an execution (qualification, or creation) task

    • saveJobLockAcquireTimeout: long, default is 10000
      maximum time (in milliseconds) to acquire a lock to save a job

    • saveJobLockTimeout: long, default is 10000
      maximum time (in milliseconds) to retain a lock to save a job

    • saveItemLockAcquireTimeout: long, default is 10000
      maximum time (in milliseconds) to acquire a lock to save an item

    • saveItemLockTimeout: long, default is 10000
      maximum time (in milliseconds) to retain a lock to save an item

    • jobRole: string, default is 4
      role of user used by job executor

    • 2021.3.0 cacheStoreThreadCount: int, default is 3
      the number of threads used to store simulation data cache asynchronously (when massImportAsyncStoreCache is true)

    • 2021.3.0 cacheStoreThreadIdleTime: int, default is 300 (5×60)
      thread (used for asynchronous simulation cache storage) idle time in seconds (when massImportAsyncStoreCache is true)

    • 2021.3.0 cacheStoreQueueSize: int, default is 100
      queue size for asynchronous simulation cache storage (when massImportAsyncStoreCache is true)

    • 2022.1.0 setChangesCompletionMinThreadCount: int, default is 5
      the minimum number of threads used to apply mass import item change orders asynchronously (when massImportSetChangesCompletionServiceEnabled is true)

    • 2022.1.0 setChangesCompletionMaxThreadCount: int, default is 10
      the maximum number of threads used to apply mass import item change orders asynchronously (when massImportSetChangesCompletionServiceEnabled is true)

    • 2022.1.0 setChangesCompletionThreadIdleTime: int, default is 300 (5×60)
      idle time of the threads used to apply mass import item change orders asynchronously (when massImportSetChangesCompletionServiceEnabled is true)

    • 2022.1.0 setChangesCompletionQueueSize
      queue size for the application of mass import item change orders (when massImportSetChangesCompletionServiceEnabled is true)

    • 2022.1.0 setChangesCompletionThreshold
      number of items from which the application of mass import item change orders is parallelized

    • 2022.1.0 setChangesCompletionThresholdFork
      number of items from which the application of mass import item change orders is forked to a dedicated pool

    • 2022.1.0 setChangesCompletionForkPoolSize
      the number of dedicated forking thread pools used to apply mass import item change orders (when massImportSetChangesCompletionServiceEnabled is true)

    • 2022.1.0 setChangesCompletionForkThreadCount
      the number of threads in the forking thread pool used to apply mass import item change orders with a dedicated forked pool (when massImportSetChangesCompletionServiceEnabled is true)

    • 2022.2.0 massImportJobOutstandingVirtual
      Leave this parameter at its default value (false)

    • 2022.2.0 massImportPurgeConfig
      jobs and items purge setup

      • enabled
        boolean to enable the purge process (default is true)

      • runAtStart
        boolean to activate a purge task launch at plugin startup (default is true)

      • schedule
        boolean to activate a scheduled execution (default is true)

      • orphanItems
        boolean to activate the purge of orphan items (items that are attached to a job that does not exist or to no job) (default is true)

      • emptyJobs
        boolean to activate the purge of empty jobs (jobs without items) (default is true)

      • processPrimaryJobs
        boolean indicating that the primary jobs should be processed (default is true)

      • processSecondaryJobs
        boolean indicating that secondary jobs should be processed (default is true)

      • processNotDoneUnreferencedJobs
        boolean indicating that we must purge the secondary jobs that are not finished, but that are attached to no primary job (or attached to a non-existent job) (default is true)

      • logLevel
        the default log level of the purge system (default is INFO)

      • primaryRetentionTime
        the retention time of primary jobs (the time after which a job identified as to be purged will be purged) (default is 45 days)

      • secondaryRetentionTime
        the retention time of secondary jobs (the time after which a job identified as to be purged will be purged) (default is one day)

      • itemRetentionTime
        the retention time of items (the time after which an item identified as to be purged will be purged) (default is 15 days)

      • timing
        a boolean that activates the timing of the purge tasks (uses a gate) (default is false)

      • surferRole
        the role ID for the surfer used to delete objects (default is 4). Security is not taken into account during the purging task, but a surfer is required.

      • surferUserId (optional)
        the user ID of the surfer used to delete objects (if not specified, the first user that has the role configured by surferRole is used)
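As a starting point for massImportJobConfig, the documented defaults can be gathered in one JSON object. This is a sketch under assumptions: only settings whose default is stated above are included, and the flat object layout is a guess, since the reference default configuration is not reproduced here.

```python
import json

# Defaults as listed above; the flat layout is an assumption.
JOB_CONFIG_DEFAULTS = {
    "jobProcessorThreadCount": 5,
    "changeProcessorThreadCount": 5,
    "stopChangesImmediatly": True,
    "keepOldJobCount": 1000,
    "jobLockAcquireTimeout": 10000,
    "jobLockTimeout": 10000,
    "saveJobLockAcquireTimeout": 10000,
    "saveJobLockTimeout": 10000,
    "saveItemLockAcquireTimeout": 10000,
    "saveItemLockTimeout": 10000,
    "jobRole": "4",
    "cacheStoreThreadCount": 3,
    "cacheStoreThreadIdleTime": 300,
    "cacheStoreQueueSize": 100,
}


def job_config(**overrides):
    """Return the configuration as JSON text, with some defaults overridden."""
    unknown = set(overrides) - set(JOB_CONFIG_DEFAULTS)
    if unknown:
        # Catch misspelled setting names early.
        raise KeyError(f"unknown setting(s): {sorted(unknown)}")
    return json.dumps({**JOB_CONFIG_DEFAULTS, **overrides}, indent=2)


config_text = job_config(jobProcessorThreadCount=8)
```

Rejecting unknown names is a design choice of this sketch; it guards against typos such as a corrected spelling of stopChangesImmediatly, which would not match the documented name.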


Extensions (business services)

The class com.noheto.restapi.MassImport from the jar restapibs.jar (copy it from the plug-in WXM-RESTAPI) provides an API you can use in your own plug-in to retrieve information about the items to be imported.

Get item target type in your extensions

String MassImport.getDataType(String/id|IObjectReadonly)

Get simulation data in your extensions

com.google.gson.JsonObject MassImport.getItemSimu({wsnoheto.engine.CTSurfer,}{org.apache.log4j.Logger,}String/id|IObjectReadonly,boolean/resetCache)

org.json.JSONObject MassImport.getItemSimuJson({wsnoheto.engine.CTSurfer,}{org.apache.log4j.Logger,}String/id|IObjectReadonly,boolean/resetCache)

Structures

Bases

  • massimportjob
    you must use this object and it must have the following fields:

    Columns: Code, Type, Nature, Default, List, View, Edit, Mandatory, Null if empty, Indexed, Tags

    • id: type identity

    • created: type datetime, tags rest_api_mass_import_include

    • modified: type datetime, tags rest_api_mass_import_include

    • owner: tags rest_api_mass_import_include

    • parent: type sentence

    • child: type word

    • status: type child, nature wkfmassimportjob

    • name: type word

    • activated: type child, nature activated, default 1

    • changes: type text, default []

    • jobowner: type child, nature user

    • expecteditems: type integer, default 0

    • defaultcollection: type word, tags rest_api_mass_import_include rest_api_create_not_required rest_api_update_not_required

    • outstanding: type child, nature activated, default 2

    • previousowner: type child, nature user

    • activestep: type word, tags rest_api_mass_import_include rest_api_create_not_required rest_api_update_not_required

    • primaryjob: type child, nature massimportjob, tags rest_api_mass_import_include rest_api_create_not_required rest_api_update_not_required

    • jobprogress: type text, tags restapi_json

  • massimportitem
    item structure. You can use your own object, but it must have the following fields, in addition to the standard fields:

    Columns: Code, Type, Nature, Default, List, View, Edit, Mandatory, Null if empty, Indexed, Tags

    • id: type identity

    • created: type datetime, tags rest_api_mass_import_include

    • modified: type datetime, tags rest_api_mass_import_include

    • owner

    • parent: type sentence

    • child: type word

    • status: type child, nature wkfmassimportjob

    • name: type word

    • activated: type child, nature activated, default 1

    • changes: type text, default []

    • job: type child, nature massimportjob

    • binary: type file, tags dam_asset_raw_file_field gallery_multi_upload

    • objectid: type word

    • objecttype: type word

    • changescache: type text

    • rejectreason: type text

    • datachanges (i18n): type text

    • itemisvalid (2021.4.0, OPTIONAL): type child, nature activated, default 2, tags REST_API_MASS_IMPORT_INCLUDE

    • itemhaswarnings (2021.4.0, OPTIONAL): type child, nature activated, default 2, tags REST_API_MASS_IMPORT_INCLUDE

 

Workflows

  • wkfmassimportjob / wkfmassimportjobaction
    This workflow must not be modified

    States: draft, pending, done, paused, cancelled
    Actions: start, pause, restart, cancel, finish

  • wkfmassimportitem / wkfmassimportitemaction

    States: failed, rejected, draft, duplicate, index, approved, processing, imported
    Actions: markduplicate, index, reject, validatebinary, invalidatebinary, complete, approve, import, fail

Fulltext indexation of simulation data

To allow fulltext search within items, the item structure must have an i18n datachanges field, as well as the associated fields for each desired locale, and these fields must be indexed.

Check the structures

You can check if structures or workflows are valid in the plug-in status page.

Job handling

Create a job

Get job data

Follow a job (workflow and outstanding state)

Control job (workflow)

Primary or secondary job

Delete a job

Job progress

Item handling

Get item data

Get final object simulation

Upload binary (A single media)

Upload archives (Several media at once)

Simulation

Reject reason

Asynchronous operations