Using SharePoint Content Classification Rules to apply managed metadata

 

Content classification rules can help assign managed metadata automatically to SharePoint documents and items. Once defined in the SharePoint Term Store, they can be applied in the background using timer jobs, for bulk tagging of a selected library, or in real time during document upload or change.

 

 

The Layer2 Auto Tagger offers two different rule-based content classification engines. V1 allows simple expressions and is used by default (if nothing else is specified). V2 offers advanced logical expressions as well as expressions against specific document properties such as the title or author.

Layer2 Default Content Classification Engine for SharePoint (V1)

 

The default classification engine builds one joint text body from the document content (retrieved by IFilters), additional SharePoint columns, the list or library name, and the URL. By default (if no rule is defined for a term), the term is assigned to a SharePoint item or document if the term label or one of its synonyms is found in this text body.
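The default behavior described above can be illustrated with a minimal Python sketch. This is not the Layer2 implementation: the function names `build_text_body` and `default_match` are hypothetical, and case-insensitive substring matching is an assumption made for the illustration.

```python
def build_text_body(content, columns, list_name, url):
    """Join document content, column values, list name, and URL into one string."""
    parts = [content, *columns.values(), list_name, url]
    return "\n".join(str(p) for p in parts if p)


def default_match(term_label, synonyms, text_body):
    """Return True if the term label or any synonym occurs in the text body.

    Assumption of this sketch: matching is a case-insensitive substring test.
    """
    haystack = text_body.lower()
    return any(c.lower() in haystack for c in [term_label, *synonyms])


# Hypothetical sample document, for illustration only.
body = build_text_body(
    content="Quarterly report prepared by John Doe.",
    columns={"Title": "Q3 report", "Author": "2;#jdoe"},
    list_name="Reports",
    url="/sites/finance/Reports/q3.docx",
)
print(default_match("jdoe", ["John Doe"], body))
```

Because every source is merged into one string, a match in the URL or the library name counts just as much as a match in the document text, which is why explicit rules can help to increase precision.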

 

To increase the precision of the metadata assignment, content classification rules can be added to a term using the Layer2 Taxonomy Manager. You can define rules as logical expressions with the following keywords: OR, AND, NOT. In this way, documents and items can be found that contain some specific required tokens, but not others. You can also use regular expressions (REGEX) to include or exclude specific patterns.

 

Assign-SharePoint-Managed-Metadata-Rule-Based-1.png

Fig.: Adding a content classification rule to the SharePoint term "jdoe" using the Layer2 Taxonomy Manager. 

 

 

For example:

 

To assign the term “jdoe” to an item you can make use of the following expression:

 

“jdoe” OR “John Doe” OR “J.Doe”

 

The term “jdoe” is assigned if the expression returns “True”.
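To make the evaluation of such an expression concrete, here is a small Python sketch of a rule evaluator for quoted tokens combined with OR, AND, and NOT. It is an illustration only, not the engine's actual parser: the function names are hypothetical, the precedence (NOT before AND before OR) and case-insensitive matching are assumptions, straight quotes are used instead of typographic ones, and REGEX support is left out because its syntax is not described here.

```python
import re

# A quoted token, or one of the three keywords outside of quotes.
TOKEN = re.compile(r'"([^"]*)"|\b(AND|OR|NOT)\b')


def tokenize(rule):
    """Split a rule into ("STR", text) and ("OP", keyword) tokens."""
    tokens = []
    for m in TOKEN.finditer(rule):
        if m.group(2):
            tokens.append(("OP", m.group(2)))
        else:
            tokens.append(("STR", m.group(1)))
    return tokens


def evaluate(rule, text):
    """Evaluate a rule against a text body (assumed precedence: NOT > AND > OR)."""
    tokens = tokenize(rule)
    haystack = text.lower()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or():
        nonlocal pos
        result = parse_and()
        while peek() == ("OP", "OR"):
            pos += 1
            right = parse_and()  # always parse, so no tokens are skipped
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_not()
        while peek() == ("OP", "AND"):
            pos += 1
            right = parse_not()
            result = result and right
        return result

    def parse_not():
        nonlocal pos
        if peek() == ("OP", "NOT"):
            pos += 1
            return not parse_not()
        token = peek()
        if token is None or token[0] != "STR":
            raise ValueError("expected a quoted token")
        pos += 1
        return token[1].lower() in haystack

    return parse_or()


print(evaluate('"jdoe" OR "John Doe" OR "J.Doe"', "Minutes taken by John Doe"))
```

With this sketch, a rule such as `"jdoe" AND NOT "draft"` would match a text that mentions jdoe but would reject it as soon as the word draft also appears.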

Layer2 Advanced Content Classification Engine for SharePoint (V2)

 

The second version of the classification engine makes it possible to check the value of a single item property. Below are the properties that can be used in queries and the operators that compare a property value against an expected value. These expressions can be combined with the logical operators NOT, AND, and OR and grouped with brackets, as in mathematics. Properties are identified by the internal name followed by “Field”, e.g. TitleField.

 

Assign-SharePoint-Managed-Metadata-Rule-Based-2.png

Fig.: Expressions as used in the Layer2 Advanced Content Classification Engine for SharePoint (V2).

 

 

Aliases are shortcuts for some rather complex internal names and for some virtual fields added by the Layer2 Knowledge Management Suite (KMS). The following aliases are currently implemented:


| Property Title | Description | Internal Name |
| --- | --- | --- |
| Title | The Title column | Title |
| Name | The Name column (with file extension, which is not shown in the default view) | FileLeafRef |
| AbsoluteUrl | The URL to the document | EncodedAbsUrl |
| RelativeUrl | The folder path relative to the root web (e.g. /mySite/myLibrary/myFolder) | FileDirRef |
| CreatedBy | The Created By column (e.g. 2;#myUser) | Author |
| Created | The Created column | Created |
| ModifiedBy | The Modified By column (e.g. 2;#myUser) | Editor |
| Modified | The Modified column | Modified |
| File Type | An internal field not visible on the web UI, containing the file extension (e.g. txt) | File_x0020_Type |
| FilePath | An internal field not visible on the web UI, containing the file path relative to the root web (e.g. /mySite/myLibrary/myFile.txt) | FileRef |
| WebTitle | The title of the current web | Virtual Field |
| FileContent | The document content as retrieved by IFilters | Virtual Field |
| AllContent | The joint content of all properties (to be compatible with the default engine) | Virtual Field |
| AttachmentNames | The names of all files attached to the current item | Virtual Field |
| AttachmentContent | The content of all files attached to the current item | Virtual Field |

 

Fig.: Using aliases to access item properties in SharePoint content classification rules. 
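As an illustration of how such aliases could be resolved, the following Python sketch maps a subset of the aliases listed above to their SharePoint internal names. The dictionary and the `resolve` helper are hypothetical and only show the idea of a lookup with fallback; virtual fields are left out because they have no internal name.

```python
# Subset of the alias list above: alias -> SharePoint internal name.
ALIASES = {
    "Title": "Title",
    "Name": "FileLeafRef",
    "AbsoluteUrl": "EncodedAbsUrl",
    "RelativeUrl": "FileDirRef",
    "CreatedBy": "Author",
    "Created": "Created",
    "ModifiedBy": "Editor",
    "Modified": "Modified",
    "FilePath": "FileRef",
}


def resolve(name):
    """Return the internal name for an alias, or the name itself if it is no alias."""
    return ALIASES.get(name, name)


print(resolve("CreatedBy"))
print(resolve("MyCustomColumn"))
```

Falling back to the given name means that a rule can still address any other column directly by its internal name.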

 

 

Example:

 

[Title] CONTAINS “By John Doe” OR [Author] CONTAINS “jdoe”

 

The term “jdoe” is assigned only if “By John Doe” is found in the Title column (not case-sensitive) or “jdoe” is the author.
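The logic of this example rule can be sketched in Python as follows. The sketch hard-codes the rule instead of parsing it, represents an item as a plain dictionary keyed by internal names, and uses hypothetical helper names; it only mirrors the CONTAINS behavior described above (case-insensitive substring test) and is not the engine's implementation.

```python
def contains(item, field, expected):
    """Case-insensitive check that a property value contains the expected text."""
    return expected.lower() in str(item.get(field, "")).lower()


def jdoe_rule(item):
    """[Title] CONTAINS "By John Doe" OR [Author] CONTAINS "jdoe"."""
    return contains(item, "Title", "By John Doe") or contains(item, "Author", "jdoe")


# Hypothetical items, for illustration only.
by_title = {"Title": "Budget 2024 by john doe", "Author": "7;#asmith"}
by_author = {"Title": "Budget 2024", "Author": "2;#jdoe"}
neither = {"Title": "Budget 2024", "Author": "7;#asmith"}

for item in (by_title, by_author, neither):
    print(item["Title"], "->", jdoe_rule(item))
```

Unlike the default engine, each check here looks at one property only, so a mention of “jdoe” in the document body alone would not trigger the rule.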

Next Steps

 

You can register for download at the Layer2 Knowledge Management Suite for SharePoint product page. If you have any questions, please contact [email protected] directly.
