ENTRY Functions - Weibo Connector. Software Version Administration Guide

The following information describes functions belonging to the ENTRY group.

[prefix_]preProcessEntry

Use this function to perform any operations at the start of processing the entry.

Inputs: xml (XmlDocument), document (Document)

Outputs: Boolean or no return value. No return value is treated as returning true. If the function returns false, the entry is not processed and is ignored.

See also:

Example

function myPrefix_preProcessEntry(xml, document) set_param("ENTRY_ID") = xml:XPathValue("~/atom:id") end

[prefix_]postProcessEntry

Use this function to perform any operations at the end of processing the entry.

Inputs: xml (XmlDocument), document (Document)

Outputs: Boolean or no return value. No return value is treated as returning true. If the function returns false, the data extracted for the entry is discarded and not processed further. Any additional feeds queued for processing as a result of this entry are still processed.

See also:

Example

function myPrefix_postProcessEntry(xml, document) print(get_param("ENTRY_ID"))

end

[prefix_]getEntryDataUrls

Use this function to return URLs containing additional entry metadata to process with optional function prefixes. The URLs are not treated as feeds themselves and are processed only to extract

metadata/content for the current entry. Wherever possible, feed URLs should be used that contain sufficient data to avoid the need for this function.

Inputs: xml (XmlDocument), document (Document)

Outputs: urls (list:{string} or table:{string->string} or table:{string->list:{string}})

See also:

Example

function myPrefix_getEntryDataUrls(xml, document)

return { "http://...", prefix1="http://...", prefix2={"http://...",

"http://..."} } end

[prefix_]getEntryDataUrlXPaths

Use this function to return XPaths referencing URLs containing additional entry metadata to process with optional function prefixes. The URLs are not treated as feeds themselves and are processed only to extract metadata/content for the current entry. Wherever possible, feed URLs should be used that contain sufficient data to avoid the need for this function.

Inputs: xml (XmlDocument), document (Document)

Outputs: xpaths (list:{string} or table:{string->string} or table:{string->list:{string}})

See also:

Example

function myPrefix_getEntryDataUrlXPaths(xml, document) cache_result()

return { xpath, prefix1=xpath1, prefix2={xpath2, xpath3} } end

[prefix_]getEntryId

Use this function to return ID for the entry. This ID is used as the reference for the document. Either this function or getEntryIdXPath must result in a valid reference.

Inputs: xml (XmlDocument), document (Document) Outputs: id (string)

See also: [prefix_]getEntryIdXPath, below

Example

function myPrefix_getEntryId(xml, document) return "reference"

end

[prefix_]getEntryIdXPath

Use this function to return XPath referencing the ID for the entry. This ID is used as the reference for the document. Either this function or getEntryId must result in a valid reference. The function is not called if getEntryId has already returned a valid reference.

Inputs: xml (XmlDocument), document (Document) Outputs: xpath (string)

See also: [prefix_]getEntryId, above

Example

function myPrefix_getEntryIdXPath(xml, document) cache_result()

return "~/atom:id"

end

[prefix_]getEntryFeedUrls

Use this function to return additional feed urls from the current entry to be processed as separate feeds.

Inputs: xml (XmlDocument), document (Document)

Outputs: urls (list:{string} or table:{string->string} or table:{string->list:{string}})

See also:

Example

function myPrefix_getEntryFeedUrls(xml, document)

return { "http://...", prefix1="http://...", prefix2={"http://...",

"http://..."} } end

[prefix_]getEntryFeedUrlXPaths

Use this function to return XPaths referencing additional feed urls from the current entry to be processed as separate feeds.

Inputs: xml (XmlDocument), document (Document)

Outputs: xpaths (list:{string} or table:{string->string} or table:{string->list:{string}})

See also:

Example

function myPrefix_getEntryFeedUrlXPaths(xml, document) cache_result()

return { xpath, prefix1=xpath1, prefix2={xpath2, xpath3} } end

[prefix_]getEntryModifiedDate

Use this function to return the modified date of the current entry. If the function is defined, the returned date will be stored in the synchronize datastore, and will be used to check whether the entry has been updated.

Inputs: xml (XmlDocument), document (Document) Outputs: modifiedDate (string)

See also:

Example

function myPrefix_getEntryModifiedDate(xml, document) return "2011-11-11T11:11:11"

end

[prefix_]getEntryModifiedDateXPath

Use this function to return XPath that references the modified date of the current entry. If the function is defined and getEntryModifiedDate is not defined or did not return a date, the returned date will be stored in the synchronize datastore, and will be used to check whether the entry has been updated.

Inputs: xml (XmlDocument), document (Document) Outputs: xpath (string)

See also: [prefix_]getEntryModifiedDate, on the previous page

Example

function myPrefix_getEntryModifiedDateXPath(xml, document) cache_result()

return "~/atom:updated"

end

[prefix_]getEntryPropertyXPaths

Use this function to return property names mapping to XPaths referencing the property values to be set in the document identifier. The document identifier by default contains only the entry ID as the

reference. This would be typically used with the identifiers action where additional properties are required to identify the resource.

Inputs: xml (XmlDocument), document (Document)

Outputs: xpaths (table:{string->string} or table:{string->list:{string}})

See also:

Example

function myPrefix_getEntryPropertyXPaths(xml, document) cache_result()

return { TYPE="~/@type" } end

[prefix_]getEntryMetadata

Use this function to return field names mapping to values. DRECONTENT can be used as the field name when assigning into the document content. The DREDATE field and any fields names ending with _DATE are converted to a standard date format using the getDateFormats function.

Inputs: xml (XmlDocument), document (Document)

Outputs: metadata (table:{string->string} or table:{string->list:{string}}) See also: getDateFormats, on page 102

Example

function myPrefix_getEntryMetadata(xml, document)

return { FieldName1="FieldValue1", FieldName2={"FieldValue2", "FieldValue3"} } end

[prefix_]getEntryMetadataXPaths

Use this function to return field names mapping to XPaths referencing the field values. DRECONTENT can be used as the field name when assigning into the document content. The DREDATE field and any field names ending with _DATE are converted to a standard date format using the getDateFormats function.

Inputs: xml (XmlDocument), document (Document)

Outputs: xpaths (table:{string->string} or table:{string->list:{string}}) See also: getDateFormats, on page 102

Example

function myPrefix_getEntryMetadataXPaths(xml, document) cache_result()

return { TITLE="~/atom:title", LINK="~/atom:link[@rel=alternate]/@href" } end

[prefix_]getEntryHashData

Use this function to return any data from the entry that should be checked for modifications before an entry is ingested. If the data has not been changed for the entry, no further processing of the entry is performed.

Inputs: xml (XmlDocument), document (Document) Outputs: hashData (list:{string})

See also:

Example

function myPrefix_getEntryHashData(xml, document) return {

xml:XPathValue("~/version"), xml:XPathValue("~/size") }

end

[prefix_]getEntryContentUrl

Use this function to return the content URL to be downloaded and imported for the document.

Inputs: xml (XmlDocument), document (Document) Outputs: content URL (string)

See also:

Example

function myPrefix_getEntryContentUrl(xml, document) cache_result()

return "~/atom:content[starts-with(@src,http)]/@src"

end

[prefix_]getEntryContentUrlXPath

Use this function to return XPath representing the content URL to be downloaded and imported for the document.

Inputs: xml (XmlDocument), document (Document) Outputs: xpath (string)

See also:

Example

function myPrefix_getEntryContentUrlXPath(xml, document) cache_result()

return "~/atom:content[starts-with(@src,http)]/@src"

end

[prefix_]getEntryContentXPath

Use this function to return XPath representing the actual content that will be imported.

Inputs: xml (XmlDocument), document (Document) Outputs: xpath (string)

See also:

Example

function myPrefix_getEntryContentXPath(xml, document) cache_result()

return "~/atom:content"

end

Example Configure Script

The following example illustrates the configure_atom.lua script.

function getConfigOptions() return {

SSLMethod="SSLV23", }

end

function getNamespaces() return {

atom="http://www.w3.org/2005/Atom"

} end

function getDateFormats()

return { "YYYY-MM-DDTHH:NN:SS" } end

-- ################ FEED URL ################

function getFeedUrls()

local config = get_config()

local section = get_param("SECTION")

local url = config:getValue(section, "FEEDURL", "") return { url };

end

-- ################ FEED ################

function getFeedModifiedDateXPath() cache_result()

return "/atom:feed/atom:updated"

end

function getEntryXPaths() cache_result()

return { "/atom:feed/atom:entry" } end

function getNextUrlXPaths() cache_result()

return { "/atom:feed/atom:link[@rel='next']/@href" } end

-- ################ ENTRIES ################

function getEntryIdXPath() cache_result()

return "~/atom:id"

end

function getEntryModifiedDateXPath() cache_result()

return "~/atom:updated"

end

function getEntryMetadataXPaths() cache_result()

return {

ID="~/atom:id",

URL="~/atom:link[@rel='alternate']/@href", DRETITLE="~/atom:title[@type='text']", AUTHOR="~/atom:author/atom:name", PUBLISHED_DATE="~/atom:published", UPDATED_DATE="~/atom:updated", }

end

function getEntryContentUrlXPath() cache_result()

return "~/atom:content[starts-with(@src,http)]/@src"

end

function getEntryContentXPath() cache_result()

return "~/atom:content"

end

A

ACI (Autonomy Content Infrastructure) A technology layer that automates operations

ACI Server

A server component that runs on the Autonomy Content Infrastructure (ACI).

ACL (access control list)

An ACL is metadata associated with a

active directory

A domain controller for the Microsoft Windows operating system, which uses LDAP to authenticate users and computers on a network.

C

Category component

The IDOL Server component that manages

Connector Framework Server (CFS)

Connector Framework Server processes the

Content component

The IDOL Server component that manages the data index and performs most of the search and retrieval operations from the index.

D

DAH (Distributed Action Handler)

DAH distributes actions to multiple copies of IDOL Server or a component. It allows you to use failover, load balancing, or distributed content.

DIH (Distributed Index Handler)

DIH allows you to efficiently split and index

I

IDOL Proxy component

An IDOL Server component that accepts

Intellectual Asset Protection System (IAS) An integrated security solution to protect your

License Server

License Server enables you to license and run multiple IDOL solutions. You must have a License Server on a machine with a known, static IP address.

O

OmniGroupServer (OGS)

A server that manages access permissions for your users. It communicates with your repositories and IDOL Server to apply access permissions to documents.

P

primary domain controller

A server computer in a Microsoft Windows domain that controls various computer resources. See also: active directory, LDAP.

V

View

An IDOL component that converts files in a repository to HTML formats for viewing in a Web browser.

W

Wildcard

A character that stands in for any character or group of characters in a query.

X

XML

Extensible Markup Language. XML is a language that defines the different attributes of document content in a format that can be read by humans and machines. In IDOL Server, you can index documents in XML format. IDOL Server also returns action responses in XML format.

client is configured on this system, click the link above and an email window opens with the following information in the subject line:

Feedback on Administration Guide (Micro Focus Weibo Connector 12.0) Add your feedback to the email and click Send.

If no email client is available, copy the information above to a new message in a web mail client, and send your feedback to [email protected].

We appreciate your feedback!

In document Weibo Connector. Software Version Administration Guide (Page 109-121)