Below I’m sharing some options for finding files or documents in SharePoint, Teams or OneDrive that are older than a specific date. Why might you need that? To avoid content being deleted by retention policies.
If an organization is mature enough, it has data lifecycle policies established. If so, these policies must be applied to information stored in Microsoft 365 via retention policies. Retention policies are configured in the Compliance center and applied, in particular, to documents stored in SharePoint and OneDrive. Policies might dictate to retain documents or to delete them. Let’s say your organization is implementing a retention policy configured to delete documents 5 years after the file was last modified. That literally means all your files modified more than 5 years ago will be deleted, and you will not even notice it. So what if you want to know which documents in your OneDrive or SharePoint site are older than 5 years?
Search in SharePoint with query parameters (GUI)
At any level of your site hierarchy (root level, library, folder etc.) you can refine your search results by specifying property values, e.g. document author, created date or last modified date. For the last modified date the property is “LastModifiedTime”, e.g. here I’m in a SharePoint site document library:
If I put the query “LastModifiedTime<2023-07-15” in the search box, I’ll get only documents last modified before July 15, 2023:
Besides “LastModifiedTime”, there is also the property “LastModifiedTimeForRetention”, which you can use to detect the documents your retention policy works against.
When you issue a query with just “LastModifiedTimeForRetention<2023-06-15”, the results include all kinds of SharePoint content: pages, libraries, folders etc. If your concern is to keep specific documents from being deleted by a retention policy, you’d probably be interested in documents only, not folders (as the retention policy applies to all files in all document libraries), e.g.
If you need only Microsoft Word documents older than a specific date (e.g. June 15, 2021), you might use the query: “*.docx LastModifiedTimeForRetention<2021-06-15”
For Microsoft Word and Excel documents older than June 15, 2021, you might use the query: “(*.docx OR *.xlsx) LastModifiedTimeForRetention<2021-06-15”
If you need only Microsoft Word documents authored by a specific user and older than a specific date, you might use the query: “*.docx author:Patti LastModifiedTimeForRetention<2021-01-01”
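If you build such queries often, a small helper can compose them. This is a minimal Python sketch; the function name and parameters are my own illustration, and it only produces the same query strings shown above.

```python
from datetime import date
from typing import Optional, Sequence


def build_old_files_query(
    cutoff: date,
    extensions: Sequence[str] = (),
    author: Optional[str] = None,
    date_property: str = "LastModifiedTimeForRetention",
) -> str:
    """Compose a search query for items last modified before `cutoff`."""
    parts = []
    if len(extensions) == 1:
        # Single file type: "*.docx"
        parts.append(f"*.{extensions[0]}")
    elif len(extensions) > 1:
        # Several file types: "(*.docx OR *.xlsx)"
        parts.append("(" + " OR ".join(f"*.{ext}" for ext in extensions) + ")")
    if author:
        parts.append(f"author:{author}")
    # Date restriction, e.g. "LastModifiedTimeForRetention<2021-06-15"
    parts.append(f"{date_property}<{cutoff.isoformat()}")
    return " ".join(parts)


print(build_old_files_query(date(2021, 6, 15), ["docx", "xlsx"]))
```

Pass date_property="LastModifiedTime" if you want the plain last modified date instead of the retention one.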
Search in Teams
You can use the same refinements to search in Teams, but select the “Files” tab for a better experience:
Microsoft is constantly updating this product, so your experience might be different. Note also that when you search in Teams, you search through all sites you have access to.
Search for old documents in OneDrive
You can use the same technique: put “LastModifiedTime<2023-07-15” in the search bar in OneDrive. In some ways it’s even better, as you can
search for files in all sites (not only your personal OneDrive site)
select multiple file types you are interested in
Search with Graph API
You can use the same query to search content with the Microsoft Graph API.
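A minimal Python sketch of such a request, assuming you already have an access token that is allowed to search SharePoint content (acquiring it, e.g. with MSAL, is out of scope here). The helper names are illustrative; the endpoint and the body shape follow the Graph search API.

```python
import json
import urllib.request

GRAPH_SEARCH_URL = "https://graph.microsoft.com/v1.0/search/query"


def build_search_request(query_string, entity_types=("driveItem",), start=0, size=25):
    """Build the JSON body for POST /search/query."""
    return {
        "requests": [
            {
                "entityTypes": list(entity_types),
                "query": {"queryString": query_string},
                "from": start,   # zero-based offset of the first hit
                "size": size,    # page size
            }
        ]
    }


def run_search(access_token, body):
    """Send the request; needs network access and a valid token."""
    request = urllib.request.Request(
        GRAPH_SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_search_request("LastModifiedTime<2023-07-15")
print(json.dumps(body, indent=2))
```

Hits come back under value[0].hitsContainers[0].hits, each with a resource that carries the file name and webUrl.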
Here are some findings from my latest Microsoft Forms troubleshooting. I’ll keep them here as a reminder for myself, and they might help you troubleshoot Microsoft Forms too.
A user can create a form and then share it. There are two kinds of links:
to respond
to edit/view/export results
A link to respond looks like https://forms.office.com/Pages/ResponsePage.aspx?id=FHPcfQGf1UWwEnFmW7HFRMgvShgV5J1Phpi7J1M_UoVUOUI1TzNQUEdWOTAzVVdRUVYzVVg4MlhZNC4u or, in its short form, https://forms.office.com/r/kDKaHWauj7
A link “to collaborate” (i.e. with this link a person can edit the form and view results) is created under “…”, “Collaborate or duplicate”, and can be for anyone, for all people in the org, or for specific people in the org
if the link looks like "https://forms.office.com/Pages/DesignPageV2.aspx?subpage=design&FormId=<FormId>" then it’s for specific people in the org
if the link looks the same but also contains "&Token=e3cd16ccf8034a3e868c68747e1f9584" then it’s for anyone with a work or school account, or for anyone in the organization
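The two link shapes above can be told apart programmatically. This is a small Python sketch based only on my observations above (the Token parameter is present or absent); the function name is made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def classify_collaboration_link(url: str) -> str:
    """Guess the sharing scope of a Forms collaboration link."""
    # Compare parameter names case-insensitively (FormId, Token).
    keys = {key.lower() for key in parse_qs(urlparse(url).query)}
    if "formid" not in keys:
        return "not a collaboration link"
    if "token" in keys:
        return "anyone with a work or school account, or anyone in the organization"
    return "specific people in the organization"


link = "https://forms.office.com/Pages/DesignPageV2.aspx?subpage=design&FormId=abc"
print(classify_collaboration_link(link))
```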
Anyone with the “edit” link can edit the form (including questions, answer options and form visibility), view responses, delete responses, create a “summary link”, create a duplicate link, and export responses to Excel (“Open in Excel” button), but cannot change collaboration options.
When a user completes the form (after the Submit button), there is an option “Save my response”. If it is selected, the user will see this form with only one (their own) response in the Forms app.
A collaborator does not see the form they have access to until they follow the link.
A form owner can move the form to a group. If so:
people who are group members (not only owners) will see this form in the Forms app, under the specific group
the form ID will change, i.e. old links will stop working
the group-owned form ID seemed to me a little longer (88 characters vs 80 for individual-owned forms) and has no dashes
The trick Tomasz Szypula (@toszypul) shared here (also cited below) on how to find the form owner having just the “collaboration” link works like a charm, even for deleted owners’ IDs.
If the form is owned by a group, the link would be similar, but with “/groups/<groupId>” instead of “/users/<UserId>”. E.g. here: https://forms.office.com/formapi/api/7ddc7314-9f01-45d5-b012-71665bb1c544/groups/65714e55-87f4-49c3-b790-fc75d7349c8a/light/...
you can see “65714e55-87f4-49c3-b790-fc75d7349c8a”, which is the group ID.
Deleting user who owns forms
When a form owner user account is deleted from AAD… tbp…
Deleting a group that owns forms
When a form owner group is deleted from AAD… tbp…
Audit log events
ListForms – Listed forms – viewed forms home page with list of forms
ViewForm – Viewed Form –
ViewRuntimeForm – Viewed response page
ViewResponses – Viewed responses
CreateResponse – Created response
ExportForm – Exported form – “export to excel” – file saved to the local machine (form owner=user)
ConnectToExcelWorkbook – Connected To Excel Workbook – “export to excel” – file saved to the teams SharePoint site under Documents (form owner = group)
toszypul replied to Jason_B1025
Jan 03 2022 03:17 AM - edited Jan 03 2022 03:18 AM
@Jason_B1025 I was able to get the ID of the user with a bit of a hack. Here are sample steps:
-Access the form using this designer direct URL https://forms.office.com/Pages/DesignPage.aspx?origin=shell#FormId=<YourFormID>
-Inspect the network traces. You will find a request similar to this
https://forms.office.com/formapi/api/72f988bf-86f1-41af-91ab-2d7cd011db47/users/e5351c57-d147-418e-89ab-3a3d50c235b6/light/forms('v4j5cvGGr0GRqy180BHbR1ccNeVH0Y5Bias6PVDCNbZUOUg4TkZJUEswSVQ1ODhNNkpHVVlMMldPTi4u')?$select=id,...
-The ID in bold (the GUID right after "/users/", here e5351c57-d147-418e-89ab-3a3d50c235b6) is the AAD ID of the user
-Use Graph Explorer - Microsoft Graph to run this request to retrieve the username and email address of the owner https://graph.microsoft.com/v1.0/users/<UserID>
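Once you have copied such a request URL from the network trace, pulling the IDs out by hand is error-prone. Here is a minimal Python sketch that extracts them; it assumes the URL shape shown in the examples in this post (tenant GUID, then "users" or "groups", then the owner GUID), and the names FormOwner, parse_formapi_url and graph_lookup_url are mine.

```python
import re
from typing import NamedTuple, Optional

GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
FORMAPI = re.compile("/formapi/api/(" + GUID + ")/(users|groups)/(" + GUID + ")/")


class FormOwner(NamedTuple):
    tenant_id: str
    owner_type: str  # "user" or "group"
    owner_id: str


def parse_formapi_url(url: str) -> Optional[FormOwner]:
    """Extract tenant ID, owner type and owner ID from a formapi request URL."""
    match = FORMAPI.search(url)
    if match is None:
        return None
    tenant_id, kind, owner_id = match.groups()
    # "users" -> "user", "groups" -> "group"
    return FormOwner(tenant_id, kind[:-1], owner_id)


def graph_lookup_url(owner: FormOwner) -> str:
    """Graph URL to look the owner up (e.g. in Graph Explorer)."""
    return f"https://graph.microsoft.com/v1.0/{owner.owner_type}s/{owner.owner_id}"


url = ("https://forms.office.com/formapi/api/7ddc7314-9f01-45d5-b012-71665bb1c544"
       "/groups/65714e55-87f4-49c3-b790-fc75d7349c8a/light/...")
owner = parse_formapi_url(url)
print(owner.owner_type, owner.owner_id, graph_lookup_url(owner))
```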
How do I know whether a form is person-owned or group-owned
Let’s say you got a claim that “we were able to work with the form, and now it is gone”, and the only thing you have is the “collaborators” link to the form. You can edit the form, view responses etc., but nobody knows who created it. So how do you determine who owns the form: a person or a group, and which person or group?
The form is owned by a person if
the form ID is 80 characters long
the “Export to Excel” button saves/downloads an Excel file to the local file system
the “Export to Excel” button generates an ExportForm (“Exported form”) event in the audit log
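These signs, together with the group-side observations earlier in this post (an 88-character ID and the ConnectToExcelWorkbook audit event), can be combined into a rough check. A Python sketch of the heuristic follows; it encodes only what I observed, not a documented rule, and the function name is illustrative.

```python
from typing import Optional


def guess_owner_type(form_id: Optional[str] = None,
                     export_audit_operation: Optional[str] = None) -> str:
    """Return "person", "group" or "unknown" from the observed signs."""
    votes = set()
    if form_id is not None:
        if len(form_id) == 80:       # observed for person-owned forms
            votes.add("person")
        elif len(form_id) == 88:     # observed for group-owned forms
            votes.add("group")
    if export_audit_operation == "ExportForm":
        votes.add("person")          # file downloaded to the local machine
    elif export_audit_operation == "ConnectToExcelWorkbook":
        votes.add("group")           # file saved to the group's SharePoint site
    # Only answer when the available signs agree with each other.
    return votes.pop() if len(votes) == 1 else "unknown"


form_id = "FHPcfQGf1UWwEnFmW7HFRMgvShgV5J1Phpi7J1M_UoVUOUI1TzNQUEdWOTAzVVdRUVYzVVg4MlhZNC4u"
print(guess_owner_type(form_id))
```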
You are seeing messages “This form can’t be distributed as it is asking for personal or sensitive information. Contact your admin for assistance. Terms of use”
or
“Form can no longer be accessed. This form has been flagged for potential phishing. Technical details”
Cause
The reason is: Microsoft enabled automated machine reviews to proactively detect the malicious collection of sensitive data in forms and temporarily block those forms from collecting responses.
Solution
Ask your tenant global or security admin to go to the Microsoft Security Administration (Defender) Alerts:
If your list of alerts is too big – use filter by Policy: “Form blocked due to potential phishing attempt”.
To unblock the form or confirm it is phishing, the admin should open the alert:
And then click “Review this form”. This opens the page “https://forms.office.com/Pages/AdminPhishingReviewPage.aspx?id=<FormId>”, where <FormId> is the form ID.
Then the global/security admin can review the form and unblock it or confirm it is phishing:
There is a known problem in SharePoint (and Teams*): a complicated permissions system. Site owners/administrators provide access, site contributors upload documents, and nobody knows who has access to their sites. As a result, sensitive documents sometimes become overshared (over-exposed).
The biggest concern is when site content is shared with “Everyone”. How do we find sites shared with “Everyone” in a large Microsoft 365 environment, so that we can request a permissions review for these sites?
(*) With the introduction of Teams, Microsoft had to simplify permissions in SharePoint, since there should be only 3 access levels: owner, member and visitor. It did, in some ways, but in other ways it made things worse.
Solution #1 (3rd-party tools)
You are lucky if you can use 3rd-party tools (e.g. ShareGate, SysKit Point, AvePoint, Metalogix etc.) that can produce a full permissions report. Still, getting a full permissions report for all tenant sites might be a problem if your M365 environment is not small. Some tools allow you to get a tenant-wide permissions report for specific IDs; this option should work better for large environments.
Still, there might be another problem. Consider the following. When I say “shared with Everyone”, I actually mean at least 3 possible “everyone” system logins:
Everyone
Everyone except external users
All users
Those are system IDs, but what if there are other IDs, e.g. custom security groups or Microsoft 365 groups (migrated from on-prem or cloud-born) that also include everyone, such as a dynamic security group that includes all org accounts? How would such a group be identified as an “Everyone” group? So you’d also need to know which groups include “all” or “almost all” users and get a report for these groups as well.
Obviously option #1 is not free, as it requires licenses. And it is still worth considering option 3.
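Whatever tool produces the report, the check itself boils down to matching principals against a list of “everyone-like” names. A Python sketch follows, assuming a hypothetical report export where each row is a dict with "resource", "principal" and "permission" keys (your tool’s column names will differ); the custom group names are examples only.

```python
SYSTEM_EVERYONE = {"everyone", "everyone except external users", "all users"}


def find_broad_grants(rows, extra_broad_groups=()):
    """Return report rows granted to an "everyone-like" principal.

    rows: iterable of dicts with "resource", "principal", "permission"
          (a hypothetical report shape, adjust to your export).
    extra_broad_groups: names of custom groups known to include all
          or almost all users.
    """
    broad = SYSTEM_EVERYONE | {name.strip().lower() for name in extra_broad_groups}
    return [row for row in rows if row["principal"].strip().lower() in broad]


report = [
    {"resource": "/sites/HR/Payroll", "principal": "Everyone except external users", "permission": "Read"},
    {"resource": "/sites/HR/Payroll", "principal": "HR Owners", "permission": "Full Control"},
    {"resource": "/sites/Legal", "principal": "All Employees", "permission": "Edit"},
]
for row in find_broad_grants(report, extra_broad_groups=["All Employees"]):
    print(row["resource"], "->", row["principal"])
```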
Solution #2 (PowerShell “Brute force”)
You can get a full permissions report per site or for the entire tenant with PowerShell, which is free. The only thing you need is to write a script yourself or find an existing one. Sounds easy?
Well, the first problem is that it takes a decent amount of time and competence to write such a script. Even if you find one, it requires some skill to adapt and run it. Frankly, I have not seen scripts that are ready to do that job out of the box. And it is not a good idea to run scripts you are not fully confident in against a production environment.
Another possible problem is the size of the environment. The script I designed and use to get a comprehensive permissions report might run for hours against a single good-sized site if I need full details at the site/subsite, list/library, folder and list item levels. So if you have fewer than 1000 sites, this approach can probably fly. But if your environment is 10K+ sites, it will take forever. So the approach might not work for large enterprise environments.
One might say we can limit the report to root web permissions only to get it faster. But this would not be accurate, and what is not accurate in IT security leads to even bigger risks. So we need to check permissions all the way down to the item level, as even one file with sensitive info shared inappropriately can cause a security issue. (By the way, 3rd-party tools usually limit reports to the library level by default, so check the reporting options.)
The other issue with this approach: let’s say you got the full permissions report. It would look like “resource -> group -> permissions”. How do you know, for each group, who its members actually are?
OK, if this solution is not easy to get working, what are the other options?
Solution #3 (Search-based)
This solution is based on a simple but clever idea: why do we need to iterate through all the tenant documents/items if all the content is already crawled by search? Search also respects permissions. Can we just use search to get files shared with Everyone? Let us see.
What if we take a dummy user account with no specific permissions and no group memberships, and search content on behalf of that account? The idea is that if this user can see some data, then this data is open to everyone.
With this option we would use the search query “*” and all 5 possible SharePoint entity types ('driveItem', 'listItem', 'list', 'drive', 'site') to find everything that is shared with everyone. We’d pull results with paging (using the “from” option in a loop to pull all results). After we get all results, we’d select only unique site collections. But we might have some problems here.
Problem #1. Again, for small environments, or if there are not many “open” sites, it would work. But for large enterprise environments the problem is the same as with “brute force”. Search would return too many results, and it might take weeks (the exact time is unpredictable) to get all of them. (Surely there are sites “legally” shared with everyone: public Office 365 group-based sites, communication sites etc. So your search will be flooded with content from sites you already know are shared with all.)
Problem #2. We are getting results with paging, but recently Microsoft started limiting the number of returned results. E.g. your search response might say “there are 3659735 total hits”, but after result number 1000 it just stops returning anything, even with paging.
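To make the paging approach concrete, here is a Python sketch of the loop. The actual HTTP call is passed in as a function (search), which is assumed to POST the body to the Graph search endpoint on behalf of the dummy account and return the parsed JSON. The function names are mine, and site_collection_url is a simplification that only understands /sites/, /teams/ and /personal/ URLs.

```python
from urllib.parse import urlparse

ENTITY_TYPES = ["driveItem", "listItem", "list", "drive", "site"]


def site_collection_url(web_url):
    """Reduce an item URL to its site collection URL (simplified)."""
    parsed = urlparse(web_url)
    root = f"{parsed.scheme}://{parsed.netloc}"
    segments = [s for s in parsed.path.split("/") if s]
    if len(segments) >= 2 and segments[0].lower() in ("sites", "teams", "personal"):
        return f"{root}/{segments[0]}/{segments[1]}".lower()
    return root.lower()  # root site collection


def collect_open_site_collections(search, page_size=100, max_pages=1000):
    """Page through "*" results and return the unique site collections seen."""
    found = set()
    start = 0
    for _ in range(max_pages):
        body = {"requests": [{
            "entityTypes": ENTITY_TYPES,
            "query": {"queryString": "*"},
            "from": start,
            "size": page_size,
        }]}
        container = search(body)["value"][0]["hitsContainers"][0]
        hits = container.get("hits", [])
        for hit in hits:
            web_url = hit.get("resource", {}).get("webUrl")
            if web_url:
                found.add(site_collection_url(web_url))
        # Stop when the service returns nothing more (see Problem #2).
        if not hits or not container.get("moreResultsAvailable", False):
            break
        start += len(hits)
    return found
```

Because of the result cap described in Problem #2, this loop may stop long before the reported total, so treat its output as incomplete in a large tenant.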
Solution#3 Option #2 – loop sites
The idea is: why do we need to get all search results if even one result from a site is enough to put the site on the list of “open” sites? In other words, we do not need all results from the site; we only need to know whether there is at least one, so we know if the site is open to everyone or not.
So, consider the following approach:
You get a list of all sites in the tenant.
You run a search request against each site in a loop (e.g. consider the KQL restriction “Site:https://yourTenant.SharePoint.com/sites/YourSite”, with no space after the colon). If at least something is found in the site, add the site to the “Open Sites” list. With this approach you will get the list of sites shared with “Everyone…” in a predictable time.
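The two steps above can be sketched in Python as follows. As before, search is assumed to be a function that sends the request body to the Graph search endpoint on behalf of the dummy account and returns the parsed JSON; the function names are illustrative.

```python
ENTITY_TYPES = ["driveItem", "listItem", "list", "drive", "site"]


def build_site_probe(site_url):
    """Request body asking for at most one hit from a single site."""
    return {"requests": [{
        "entityTypes": ENTITY_TYPES,
        # KQL property restriction: no space between "Site:" and the value.
        "query": {"queryString": f'Site:"{site_url}"'},
        "from": 0,
        "size": 1,  # one hit is enough to call the site "open"
    }]}


def find_open_sites(site_urls, search):
    """Return the sites where the dummy account can see at least one item."""
    open_sites = []
    for site_url in site_urls:
        response = search(build_site_probe(site_url))
        container = response["value"][0]["hitsContainers"][0]
        if container.get("total", 0) > 0:
            open_sites.append(site_url)
    return open_sites
```

One request per site keeps the run time proportional to the number of sites, not to the number of documents.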
Solution#3 Option #3 – exclude known “open” sites
There are sites “legally” shared with everyone, e.g. the corporate portal, department communication sites, public teams, public Viva Engage communities etc. If it is known that these sites are public, you can exclude them from the list of all sites, so in “Solution#3 Option #2 – loop sites” you’d loop only through sites that are not supposed to be public. I know, the share of “legally public” sites among all sites in a tenant is relatively small, so this should not significantly decrease the elapsed time, but still.
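The exclusion itself is a simple filter; the only catch is that the same site URL may be written with different casing or a trailing slash in different lists. A small Python sketch with illustrative names:

```python
def normalize_site_url(url):
    """Make URLs comparable: trim, drop trailing slash, lowercase."""
    return url.strip().rstrip("/").lower()


def sites_to_check(all_sites, known_public_sites):
    """Return the sites that are not on the known-public list."""
    public = {normalize_site_url(u) for u in known_public_sites}
    return [u for u in all_sites if normalize_site_url(u) not in public]


all_sites = [
    "https://contoso.sharepoint.com/sites/Portal/",
    "https://contoso.sharepoint.com/sites/Finance",
]
print(sites_to_check(all_sites, ["https://contoso.sharepoint.com/sites/portal"]))
```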
Pros and cons of the Solution # 3
Pro: this is the only solution that is fast enough (or at least completes in a predictable time) and accurate enough to rely on.
Con 1: crawling and indexing take time, so search-based reports can miss recent changes in data and permissions.
Con 2: this approach cannot be automated (since we need interactive authentication), i.e. we need to run it manually every time.
Con 3: after we get all sites shared with everyone, we do not know at what level permissions are broken and provided to everyone. It might be the entire site or one file. If you want to know what exactly is shared with everyone, run a permissions report against this shortlist.
Notes
Note 1: consider that there are resources like “Styles Library” shared with everyone by default, especially on migrated sites.
Note 2: there might be security groups intended to hold all or part of the enterprise (e.g. “All employees” or “All contractors”). If the enterprise comprises several businesses or regions, it might be “All Business 1” or “All EMEA”; you get the idea. You can modify the search-based solution by adding the dummy account you are running search on behalf of to some of these groups, to find out whether there are resources shared maybe not with everyone but with all “North America based” users or with “all employees”, which might also make sense.
Note 3: it is a separate service, but consider implementing/using sensitivity labels. You can at least start with highly sensitive sites. With sensitivity labels, site owners/members would know what kind of site they are working on.
What’s next
OK, we know the list of SharePoint resources shared with everyone, but what would be the next step? Should we communicate with site owners? If so, how do we let site owners know that there are resources shared with Everyone on their sites? To be continued…