There is a known problem in SharePoint – complicated permissions system. Site owners/administrators provide access, site contributors upload documents and nobody knows – who has access to their sites. As a result – sometimes sensitive documents become overshared (over-exposed).
The biggest concern is sites content shared with “Everyone”. How do we find sites shared with “Everyone” in a large Microsoft 365 environment?
NB. When I say “shared with Everyone” – I actually mean 3 possible “everyone” logins:
- Everyone except external users
- All users
Approach #1 (Brute force)
We can get full permissions report at tenant level (or permissions provided only to “Everyone”). There are 3-rd party tools (e.g. ShareGate, SysKit, AvePoint, Metalogix etc.), or you can run PowerShell script…
Sounds easy? Well, if you have less than 1000 sites – probably it will work. But if your environment is 10K+ sites – it will take forever. Permission report might run hours for an average site with site/subsite, list/library and list item details level. So the approach will not work for large enterprise environments.
One might say – we can limit report with root web permissions only to get it faster. But this would not be accurate. And what is not accurate in the IT security – lead to even bigger risks. So, we need report detailed up to every item level deep, as even one file with sensitive info shared with everyone can cause security issue. (3-rd party tools usually by default limit it to libraries level.)
Ok, if this approach is not really working – what’s working?
Approach #2 (Search)
Clever idea: why do we need to iterate through all the tenant documents/items if all the content is already crawled by search? Search is also respect permissions. Can we just use search to get files shared with Everyone? Let us see.
What if we use some dummy user account with no specific permissions provided and no group membership and try to search content on behalf of that account. The idea is if this user can see data – it means that data is open for everyone.
Check this and this articles. Can we get results programmatically (e.g. with PowerShell)? Can we use Microsoft Graph search API? Sure.
Check this article “How to search against SharePoint Online Content with Microsoft Graph search API with PowerShell”.
But! We have some problems here.
Search Problem #1. Again, for small environments or if there are not much “Open” sites – it would work. But for large enterprise environments the problem is the same as in “brute force”. Search returns too many results – it’ll take weeks to get all of them. (There are team sites “legally” shared with everyone, public Office 365 group based sites, communication sites… ).
Search Problem #2. Even if we get all search results – we do not know – at what level permissions are provided to everyone. So we will need to build list of sites based on the search results – ant then still need to run permissions report against these sites.
Search Problem #3. We are getting results with paging. But recently Microsoft started limiting number of returning results. E.g. your search request result might say like “there are 3659735 total hits” but after result number 1000 it just stops returning anything, even with paging.
Approach # 3 Hybrid
The idea: why do we need to get all search results if even one result from a site would be enough to add the site to the list of sites require permissions review. In other words, we do not need all results from site, we only need one to know the site is open.
So, consider (imho, the best) approach (Solution):
- You get list of all sites in tenant.
- You run search request against each site in the loop
(e.g. consider KQL option “Site: https://yourTenant.SharePoint.com/sites/YourSite”.
If at least something found in the site – add the site to the “Open Sites” list.
With this approach you will get list of sites shared with “Everyone…” in a couple of minutes.
- Run permissions report against this shortlist
Note: consider there are resources like “Styles Library” shared with everyone by default.
Note: You can refine the list you get at step 1 – e.g., excluding sites connected to public teams or known communication sites…
Note: consider implementing sensitivity labels. At least you can start with high-sensitive sites. Site owners/member will know – what kind of site they are working on.
Pros and cons of the Approach # 3 Hybrid
Pro: the only fast and accurate enough to rely on
Con 1 : crawling and indexing takes time, so search-based reports can miss recent changes in data and permissions
Con 2: this approach cannot be automated (since we need an interactive authentication).
How to communicate to site owners
The Next step would be “How to let site owners know that there are resources shared with Everyone… on their sites”.