Our Desired Outcome
In this blog post we will accomplish the below goals:
- Outline what problem a Graph Connector solves
- Explain why you should use a Graph Connector, and how
- Explain what External Data is and how to create it
- Explain what an External Data Schema is and how to create it
- Explain what an Indexer component is and how to create it
- Show you how to surface this External Data, once it has been indexed and ACL’d, to your users
- Outline next steps and a Call to Action (CTA)
Problem Statement
Why should you consider a Microsoft Graph Connector? What problem is it solving for you? Is it the only way to solve this problem? These are questions that I want to answer from the beginning of this journey we are about to take.
- Primer: My organization has data that is distributed. It could be solely on premises, a combination of on premises and in the cloud (SaaS), or spread across multiple SaaS providers, and I also have/support Microsoft Office 365 as part of that same network of available services.
- Scenario 1: One main problem here is the ‘findability’, or better yet ‘discoverability’, of information, i.e. where do I look for it, and how do I find it?
- Scenario 2: My organization is concerned about Security and Compliance, e.g. DLP and Legal Holds. Based on the primer above, my problem is how to narrow my gaze and still cast a wide enough net over all the data I need to manage and/or secure.
- Scenario 3: My workforce is distributed. That could be just ‘how we work’, or it could be a fact of life as we live it now under a pandemic. Data sprawl can be a problem: it can affect performance and also create anxiety. How do I make sure that my workforce has access to the right data at the right time?
There are probably many other scenarios you can think of in this problem statement, but for me these are top of mind. If you do have more, please leave me comments or ping me on Twitter @fabianwilliams
Why should you be interested in Graph Connectors
Simple: Graph Connectors are an answer to the problem statements above. How? By creating a connection to your data sources, adopting a schema for the data that will be made available, determining who should have access to that data once it is surfaced, and keeping that information fresh in Microsoft 365. This may be accomplished through the Microsoft 365 Search Admin Center and/or the Microsoft Graph API.
Let’s look a bit deeper under the covers though. As taken from our docs: “With Microsoft Graph connectors, your organization can index third-party data so it appears in Microsoft Search results. This feature expands the types of content sources that are searchable in your Microsoft 365 productivity apps and the broader Microsoft ecosystem. The third-party data can be hosted on-premises or in the public or private clouds.” Connectors are built both by Microsoft and by our 3rd party partners; a list of them may be found in our connectors gallery here.
In this post we will focus our attention on what may be accomplished via the Microsoft Graph API, in particular what you are able to do under the Search endpoint through Indexing, which is currently still in beta.
Tell me more
I will have samples for my Step by Step which will guide you through the details below:
- External Connection
- Schema
- Creating the External Item (Index)
- Security Model / Access Control List (ACL)
- Surfacing the External Data
- Office Hub
- SharePoint Online
The next sections will go into the detail above, where I will tell the story around each item: what it is, why you need it, and how you use it, along with a sample that you should be able to just pick up and run with, as I have done here in my demo. The code is in my GitHub repository here. I also have a public GitHub Gist bundle with all the files here, but I will be calling them out independently as we go.
At the time of this writing, Microsoft Graph connectors are in public preview status. To gain access to connectors functionality, you must turn on the Targeted release option in your tenant. See more details on the connectors preview program.
There are also some known limitations that I will link to here. Nothing too crazy, just limits on amount, size, security, and sorting.
The Set Up
Here is our scenario and story…. A long time ago.. JK…
You are an Organization or an ISV that has data spanning On Premises and possibly Cloud, and you are also using Microsoft 365. I am imagining that you have a data store that you can liken to a SQL table, Oracle table, NoSQL JSON container, or even a SaaS [Salesforce, Oracle NetSuite, ServiceNow, Jira] dataset. For the sake of simplicity, and so that this can be repeatable, I am using a format that you can all work with in a demo, that is… a comma separated value (CSV) file with a header row that represents the properties or fields, and rows of data. In our case we are going to use a freely available Product data set from Kaggle for FlipKart.
Our Data
Our data is neatly tucked away in my GitHub Repo Here
and a view of what it looks like is above.
Our User(s)
For our scenario we are setting up ONLY ONE ACL. You will see more of this in the Security section below, and in our test, when we show another user performing a search, you will notice the results honoring the security we specified.
Our Use Case
Our use case is rather simple, and it anchors around different sources you can target with Graph Connectors, in our case Files, see our guidance here for others. We want to surface our Data that is external to Microsoft 365 to be treated JUST LIKE Microsoft 365 data i.e. 1st Party == 3rd Party such that I have only one place to go to get a 360 view of my Data regardless of its origin.
Register your App (for your Graph Connector) in Azure Active Directory (AAD)
The 1st step is to register your App as you would any other Graph App Registration, paying attention to the following Graph permissions that are needed
External Connection
An External Connection is a logical container to add content from an external source into Microsoft Graph from which you are able to:
- Create a Connection
- Read/List a Connection
- Update a Connection
- Delete a Connection
So, full CRUD capabilities. In this section, for which you can see more guidance from Microsoft here, we will show you how to create said connection. Pay attention to the ID you supply in this HTTP POST call, as you will need it for other calls related to this connection and its data.
Below you will see how to create that connection with POST /external/connections
    {
      "name": "FabsDemoFilesDriveAlpha",
      "description": "Index my hard drive network share demo",
      "id": "fabsdemofilesdrivealpha"
    }
that will call https://graph.microsoft.com/beta/external/connections endpoint
Below is a screenshot of the call that I did using Postman against the same endpoint
and at this point if you went to the Search Admin Center, this is what you will see
What you have above is a connection, now created inside the tenant where your application is registered, and currently in Draft mode.
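If you would rather script this call than run it from Postman, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not the code from my repository: the function name `build_create_connection_request` is made up for illustration, and the `<access-token>` placeholder assumes you have already obtained a token for your registered app. The request is only built here, not sent.

```python
import json
import urllib.request

GRAPH_BETA = "https://graph.microsoft.com/beta"


def build_create_connection_request(token, connection_id, name, description):
    """Build (but do not send) the POST request that creates a connection.

    The helper name and signature are illustrative assumptions.
    """
    body = {"name": name, "description": description, "id": connection_id}
    return urllib.request.Request(
        url=f"{GRAPH_BETA}/external/connections",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # placeholder token, see lead-in
            "Content-Type": "application/json",
        },
    )


request = build_create_connection_request(
    token="<access-token>",
    connection_id="fabsdemofilesdrivealpha",
    name="FabsDemoFilesDriveAlpha",
    description="Index my hard drive network share demo",
)
# To actually send it once you have a real token:
# urllib.request.urlopen(request)
```

Keeping the connection ID in one variable pays off later, because every schema and item call is addressed by that same ID.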
Schema
Now that you have the connection, the next thing you will do is define and create the schema that you want applied to the External Data that you are going to be surfacing. This can be as I tell my partners and customers:
- The full Entity object representation of what you want to surface i.e. everything
- A partial representation of what you want to surface, i.e. perhaps you want to send JUST what your users will need and, for everything else, have them click a link to either use an immersive experience inside a Microsoft 365 product, open up a modal, or pop them back into a web browser or application view.
- Also, an entity may be a “REAL” thing, like a file, or it can be an abstraction, something that is arbitrary or has no file type but exists in your world. Consider that inside your product you may have widgets; these widgets are not files, but you want to surface the information inside the widget
Your call to create the Schema is also a POST: POST /external/connections/{id}/schema
    {
      "baseType": "microsoft.graph.externalItem",
      "properties": [
        { "name": "uniqid", "type": "String", "isSearchable": "false", "isRetrievable": "false", "isQueryable": "false" },
        { "name": "producturl", "type": "String", "isSearchable": "false", "isRetrievable": "true", "isQueryable": "false" },
        { "name": "productname", "type": "String", "isSearchable": "true", "isRetrievable": "true", "isQueryable": "false" },
        { "name": "retailprice", "type": "String", "isSearchable": "false", "isRetrievable": "true", "isQueryable": "false" },
        { "name": "discountedprice", "type": "String", "isSearchable": "false", "isRetrievable": "true", "isQueryable": "false" },
        { "name": "image", "type": "String", "isSearchable": "false", "isRetrievable": "true", "isQueryable": "false" },
        { "name": "description", "type": "String", "isSearchable": "true", "isRetrievable": "true", "isQueryable": "false" },
        { "name": "brand", "type": "String", "isSearchable": "false", "isRetrievable": "true", "isQueryable": "true" }
      ]
    }
and that will call against the ID previously created. Please see my screenshots below:
This 1st item is where I am getting ready to make the call
This 2nd item is the result from the call. Take note of the Location property in the response header: it is the URL used in the follow-up GET, which shows you the status of the Schema operation
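Typing eight nearly identical property objects by hand is error prone, so here is a small Python sketch that generates the same schema body. The helper names `make_property` and `build_schema` are hypothetical, and the flags are written as the strings "true" and "false" only to mirror the request body shown above.

```python
def make_property(name, searchable=False, retrievable=True, queryable=False):
    """Return one schema property in the same shape as the request body above."""

    def flag(value):
        # The demo payload sends the flags as strings, so we do the same.
        return "true" if value else "false"

    return {
        "name": name,
        "type": "String",
        "isSearchable": flag(searchable),
        "isRetrievable": flag(retrievable),
        "isQueryable": flag(queryable),
    }


def build_schema(properties):
    """Wrap the property list in the externalItem schema envelope."""
    return {"baseType": "microsoft.graph.externalItem", "properties": properties}


schema = build_schema([
    make_property("uniqid", retrievable=False),
    make_property("producturl"),
    make_property("productname", searchable=True),
    make_property("retailprice"),
    make_property("discountedprice"),
    make_property("image"),
    make_property("description", searchable=True),
    make_property("brand", queryable=True),
])

# The body would be POSTed to this path, using the connection ID created earlier.
schema_path = "/external/connections/fabsdemofilesdrivealpha/schema"
```

Defaulting to retrievable but neither searchable nor queryable matches most of the properties in this demo, so only the exceptions need to be spelled out.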
Once this action is completed, which in my case took approximately 3 minutes, we are two thirds done and now we just need to PUT the data, that is, push it from our local system up to Microsoft 365. Before doing this, however, you would have had to consider your security model, because it is inside the Index (which is what we are about to create) that you define your ACL. More to come when we talk about security; logically you would have done this before, but I felt it easier to show you something and then explain in more detail afterwards.
The call would look like this: PUT /external/connections/{connection-id}/items/{item-id}
Let’s unpack what you are seeing here and explain what this is versus what you will be doing in real life.
Demo Scenario
In this demo scenario I am imagining a data store that you can liken to a SQL table, Oracle table, NoSQL JSON container, or even a SaaS [Salesforce, Oracle NetSuite, ServiceNow, Jira] dataset. What I am doing here for the sake of simplicity and so that this can be repeatable is use a format that you can all work with in a demo, that is… a comma separated value (CSV) file with a header row that represents the properties or fields, and rows of data.
What you see above is Postman Runner, a tool that you can use to automate, stress test, or basically run a command, in this case a REST call, by feeding it the field values as variables and a file containing the data to iterate through. In the case above you can see multiple PUT calls to the endpoint, each with a unique ID, and 200 responses being returned. Next I will open one up and show you what’s inside a payload.
and finally if we open it up you can see in line 3 through 9 the Security that’s bound to this indexed item, and from lines 11 through 19 the schema along with associated data we are pushing up to Microsoft 365 from our local environment i.e. the External Data.
Real Life
If this were real life, let me bullet point what would be happening here:
1. You have an external system that is your data store/source
   1.1 When changes happen in that system you will have a record of them, and you can make #2 below event based or batched at end of day
2. Based on some frequency you will take the data that’s new or updated in 1.1 above, whether it is in memory or persisted elsewhere, and you can either
   2.1 In real time, store those changes as variables and then pass them to PUT /external/connections/{connection-id}/items/{item-id} as you see in the above screenshot, where {item-id} is the unique identifier for the row in your HTTP PUT and the properties and their values are sent in the Message Body.
   2.2 Batch this at end of day, which could actually make my DEMO scenario a real life scenario, if you persisted the information in an external file and just read from it.
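The batch option can be sketched in a few lines of Python: read the CSV and turn every row into the URL and body of one PUT call. Treat this as a sketch under assumptions rather than the demo code itself. It assumes the CSV header has already been renamed to match the schema property names, it uses the row's description as the item content, the sample row is invented, and the item ID check (alphanumeric, at most 128 characters) comes from the externalItem property table in the Security section.

```python
import csv
import io
import re

GRAPH_BETA = "https://graph.microsoft.com/beta"
SCHEMA_FIELDS = [
    "uniqid", "producturl", "productname", "retailprice",
    "discountedprice", "image", "description", "brand",
]
# Item IDs must be alphanumeric and at most 128 characters.
ITEM_ID_PATTERN = re.compile(r"[A-Za-z0-9]{1,128}")


def build_item(row, user_id):
    """Build the externalItem body for one CSV row, granting access to one user."""
    return {
        "@odata.type": "microsoft.graph.externalItem",
        "acl": [{
            "type": "user",
            "value": user_id,
            "accessType": "grant",
            "identitySource": "azureActiveDirectory",
        }],
        "properties": {field: row[field] for field in SCHEMA_FIELDS},
        # Assumption: the description doubles as the full-text content.
        "content": {"value": row["description"], "type": "text"},
    }


def iter_put_requests(csv_file, connection_id, user_id):
    """Yield (url, body) pairs, one per CSV row, ready to be sent with HTTP PUT."""
    for row in csv.DictReader(csv_file):
        item_id = row["uniqid"]
        if not ITEM_ID_PATTERN.fullmatch(item_id):
            raise ValueError(f"invalid item id: {item_id!r}")
        url = f"{GRAPH_BETA}/external/connections/{connection_id}/items/{item_id}"
        yield url, build_item(row, user_id)


# Invented sample data standing in for the real product file.
SAMPLE_CSV = (
    "uniqid,producturl,productname,retailprice,discountedprice,image,description,brand\n"
    "abc123,https://example.com/p/1,Sample Shirt,999,499,"
    "https://example.com/1.png,A sample shirt,SampleBrand\n"
)

puts = list(iter_put_requests(
    io.StringIO(SAMPLE_CSV), "fabsdemofilesdrivealpha", "<user-object-id>"
))
```

Because the function is a generator, the same loop works for a thousand-row end of day file or for a single changed row pushed in real time.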
Regardless of the approach you use, what you will see if you look in the Admin Center now would be the following
which shows you that items are being indexed into Microsoft 365.
Security
When creating an externalItem (Index), the following fields are required: @odata.type, acl, and properties. The properties object must contain at least one property.
Property | Type | Description |
---|---|---|
acl | acl collection | An array of access control entries. Each entry specifies the access granted to a user or group. Required. |
content | externalItemContent | A plain-text representation of the contents of the item. The text in this property is full-text indexed. Optional. |
id | String | Developer-provided unique ID of the item within the containing externalConnection. Must be alphanumeric and a maximum of 128 characters. Required. |
properties | Object | A property bag with the properties of the item. The properties MUST conform to the schema defined for the externalConnection. Required. |
The ACL itself must be either AAD or External. I think the AAD portion is easy. When I explain the External ACL, I usually say: imagine you do not use AAD as your Identity Provider (IdP). What you would need to do is map a relationship between your External Item Index and the security model in AAD in order to honor the security context. If it is AAD, then this can be an AAD User or Group, as you will see in my demo. If you would like more details on this, please reference the guidance here for External Item and here for ACL.
In our Demo example my Index (External Item) is shown below
    {
      "@odata.type": "microsoft.graph.externalItem",
      "acl": [
        {
          "type": "user",
          "value": "your-user-GUID-here-347aac675901",
          "accessType": "grant",
          "identitySource": "azureActiveDirectory"
        }
      ],
      "properties": {
        "uniqid": "{{uniqid}}",
        "producturl": "{{producturl}}",
        "productname": "{{productname}}",
        "retailprice": "{{retailprice}}",
        "discountedprice": "{{discountedprice}}",
        "image": "{{image}}",
        "description": "{{description}}",
        "brand": "{{brand}}"
      },
      "content": {
        "value": "Error in gateway...",
        "type": "text"
      }
    }
Note that in acl [2nd line], under type I have user, and the value is the GUID of my user. This can also be a Group, as seen in the sample code below
    HTTP/1.1 200 OK
    Content-type: application/json

    {
      "@odata.type": "microsoft.graph.externalItem",
      "acl": [
        {
          "type": "user",
          "value": "e811976d-83df-4cbd-8b9b-5215b18aa874",
          "accessType": "grant",
          "identitySource": "azureActiveDirectory"
        },
        {
          "type": "group",
          "value": "14m1b9c38qe647f6a",
          "accessType": "deny",
          "identitySource": "external"
        }
      ],
      "properties": {
        "title": "Error in the payment gateway",
        "priority": 1,
        "assignee": "john@contoso.com"
      },
      "content": {
        "value": "Error in payment gateway...",
        "type": "text"
      }
    }
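Since a typo in an ACL entry is a security problem rather than a cosmetic one, it can help to build the entries through a small validating helper. The sketch below is illustrative only: `make_acl_entry` is a made-up name, and the allowed values are deliberately limited to the ones that appear in the two samples above. The API may well accept more than these.

```python
# Allowed values are restricted to those seen in the samples above (an assumption).
ALLOWED_TYPES = {"user", "group"}
ALLOWED_ACCESS_TYPES = {"grant", "deny"}
ALLOWED_IDENTITY_SOURCES = {"azureActiveDirectory", "external"}


def make_acl_entry(entry_type, value, access_type="grant",
                   identity_source="azureActiveDirectory"):
    """Return one access control entry, rejecting values not seen in the samples."""
    if entry_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown ACL type: {entry_type!r}")
    if access_type not in ALLOWED_ACCESS_TYPES:
        raise ValueError(f"unknown access type: {access_type!r}")
    if identity_source not in ALLOWED_IDENTITY_SOURCES:
        raise ValueError(f"unknown identity source: {identity_source!r}")
    if not value:
        raise ValueError("an ACL entry needs a user or group identifier")
    return {
        "type": entry_type,
        "value": value,
        "accessType": access_type,
        "identitySource": identity_source,
    }


# Rebuilds the two entries from the sample response above.
acl = [
    make_acl_entry("user", "e811976d-83df-4cbd-8b9b-5215b18aa874"),
    make_acl_entry("group", "14m1b9c38qe647f6a",
                   access_type="deny", identity_source="external"),
]
```

Defaulting to a grant against Azure Active Directory keeps the common case short, while a deny or an external identity has to be requested explicitly.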
So that does it for security. Once the Index (External Item) is all up in Microsoft 365 and you take a look at the Admin Center again, I would expect it to look like the below
So, now we have almost one thousand items in our index…what’s next?
Surfacing the External Data
Most of this experience is documented in Microsoft guidance and it’s not done in the API, so I don’t want to make this long post even longer, but for completeness I will show you some screenshots as I give you the docs here from Microsoft. The 1st thing you would have noticed while the Index was in flight was some “Required Actions”; let us go back to that screenshot.
So the 1st thing you will need to do is create a Vertical, and after that you will need to create a Result Type. I can already hear you asking.. What is that?
Search Vertical
As taken from the docs here “At the top of the Microsoft Search results page, there’s a row of tabs. These are the search verticals. A search vertical only shows results of a certain type or from certain content. Examples are Files or News. By default, Microsoft Search shows the verticals All, People, Files, Sites, and News.
You can add search verticals that are relevant to your organization. These will appear on the Microsoft Search results page in SharePoint, Office, and Bing. For example, you could create a vertical for marketing-related content and another for sales, based on the type of information that each group needs. You can add verticals to show results only from content indexed via connectors.” In our case, our Vertical is this fictitious data we got from Kaggle about products.
Here are the steps to create a Search Vertical; all you will need to do differently is use the connection we created here
Result Type
Again, as taken from the docs “You can define how results are displayed in the vertical by designing the layout using result types. The result layout lets you show important information directly in the search results, so users don’t have to select each result to see if they found what they’re looking for.”
Here are the steps to create a Result Type of your own.
My Demo Experience
In my demo experience we first created a Search Vertical, which as you can see from the above is very straightforward.
and then followed a few steps that are pretty simple to figure out, most of which are optional and serve to limit the data you get back in your results. The key below is picking the correct Connector you created earlier
Then it’s next, next, next, Finish. Off to getting the Result Type: for that we follow the instructions above, and the steps are equally straightforward
and here again selecting the correct source
The screenshot below is where it does get a bit nuanced, as you need to determine “HOW” you would like to see the results coming back. Fortunately there is a Layout Designer: when you click the button indicated as #1 below, it opens a designer where you can select from some predefined templates. The template I chose is below
    {
      "type": "AdaptiveCard",
      "version": "1.0",
      "body": [
        {
          "type": "ColumnSet",
          "columns": [
            {
              "type": "Column",
              "width": 1,
              "items": [
                {
                  "type": "Image",
                  "url": "{image}",
                  "size": "Large",
                  "horizontalAlignment": "Left"
                }
              ],
              "spacing": "None"
            },
            {
              "type": "Column",
              "width": 9,
              "items": [
                {
                  "type": "TextBlock",
                  "text": "[{productname}]({producturl})",
                  "color": "Accent",
                  "size": "Medium",
                  "weight": "Bolder",
                  "maxLines": 3
                },
                {
                  "type": "TextBlock",
                  "text": "{Description}",
                  "wrap": true,
                  "maxLines": 3,
                  "spacing": "Medium"
                }
              ],
              "horizontalAlignment": "Center",
              "spacing": "Medium"
            }
          ],
          "spacing": "None"
        }
      ],
      "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
      "$data": {
        "description": "Marketing team at Contoso.., and looking at the Contoso Marketing documents on the team site. This contains the data from FY20 and will taken over to FY21...Marketing Planning is ongoing for FY20..",
        "image": "https://searchuxcdn.blob.core.windows.net/designerapp/images/long-stock-image.png",
        "producturl": "https://modernacdesigner.azurewebsites.net",
        "productname": "Contoso Research Memo"
      }
    }
and pasting it in below you see…
Once you have settled on a template, you are given the option to copy the JSON content that represents your Adaptive Card and paste it in the space you see at #2. I also draw your attention to #3, as it is basically the properties and data types that you have in your Index, which marry up to your Schema.. or at least should. If they do not, it will let you know your JSON does not map. How do I know this? I had a capital letter in a property in my Schema and a lower case one in my Index, and yes, I paid dearly.
So in the end, when we go to two out of the 3 places mentioned above to consume the External Data and do a search, let’s say for “women” in an attempt to find products for women:
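A casing mismatch like that one is easy to catch before you paste anything into the designer. Here is a rough Python sketch that collects every `{placeholder}` in a card and reports the ones that do not exactly match a schema property name. The function names are invented, the card is a trimmed stand-in for the template above, and exact case-sensitive matching is my assumption about how the mapping behaves, based only on the gotcha just described.

```python
import re

PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_placeholders(node):
    """Recursively collect {name} placeholders from every string in the card."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$data":  # sample data only, not part of the layout
                continue
            found |= find_placeholders(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_placeholders(item)
    elif isinstance(node, str):
        found |= set(PLACEHOLDER.findall(node))
    return found


def unmatched_placeholders(card, schema_names):
    """Return placeholders with no exact (case-sensitive) schema match, sorted."""
    return sorted(find_placeholders(card) - set(schema_names))


SCHEMA_NAMES = [
    "uniqid", "producturl", "productname", "retailprice",
    "discountedprice", "image", "description", "brand",
]

# Trimmed stand-in for the Adaptive Card template shown above.
card = {
    "type": "AdaptiveCard",
    "body": [
        {"type": "Image", "url": "{image}"},
        {"type": "TextBlock", "text": "[{productname}]({producturl})"},
        {"type": "TextBlock", "text": "{Description}"},
    ],
    "$data": {"description": "sample text"},
}

print(unmatched_placeholders(card, SCHEMA_NAMES))  # ['Description']
```

An empty list means every placeholder in the layout lines up with a property in the schema.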
Office Hub
Below you will see us inside the Office Hub, which is where you go when you type in portal.office.com. We are conducting a search for Women in the #2 callout, and in the #3 callout you can see we are under the FlipCartCatalog Vertical with our result sets showing up.
SharePoint Hub
Below you will see us inside the SharePoint Hub, which is where you go when you type in portal.office.com. We are conducting the same search for Women in the #2 callout, and in the #3 callout you can see we are under the FlipCartCatalog Vertical with our result sets showing up.
Show me how the ACL prevents Unauthorized Access
Of course… anyone EXCEPT my user should not gain access, as you can see from the below user conducting a search
and in those simple steps we now have a working REAL WORLD Graph Connector Step by Step.
Resources
Overview of Microsoft Graph connectors
License requirements and pricing
Use the Microsoft Search API to index data
Summary
In our Demo scenario I created my Vertical and Result Type, and it was pretty much straightforward. There was one GOTCHA, however: I could never get the Thumbnail view in my Result Type to show the image, even though I know it is coming through [you can verify by looking at the Index screenshot to see it there in the payload]. I think it’s perhaps too big in size; when I go to the image directly it’s a full page image, but I did not confirm my suspicions.
Finally, I hope this was of value to you. I have provided enough information, and hopefully enough detail, that you can take this and run with it, but should you have questions, please let me know. The best way is to fire off a Tweet to @fabianwilliams and/or a message on LinkedIn.