Below I’m sharing how to find old or outdated content in SharePoint, Teams or OneDrive site. Specifically, files or documents that are older than some certain date. Why you might need that? For example – to delete content to save space or opposite – to avoid content deletion as a result of retention policies in action (* see below for details).
Search in SharePoint with query parameters (GUI)
At any level of your site hierarchy – root level, library, folder etc. – you can refine your search results specifying properties values, e.g. document author or document created date or document last modified date. For last modified date the property is “LastModifiedTime”, e.g. here I’m in the SharePoint site document library:
If I put in search box query “LastModifiedTime<2023-07-15” I’ll get only documents older than July 15 2023:
There is property “LastModifiedTime”, and there is also property “LastModifiedTimeForRetention” you can use to detect documents your retention policy works against.
When you issue query with just “LastModifiedTimeForRetention<2023-06-15” you get as results all kind of SharePoint content – including pages, libraries, folders etc. If your concern is to avoid specific documents deletion as a result of retention policy – you’d probably be interested in finding documents only and do not want folders (as retention policy applies to all files in all document libraries), e.g.
If you need only Microsoft Word documents older than some specific date (e.g. June 15, 2021), you might use query: “*.docx LastModifiedTimeForRetention<2021-06-15”
For Microsoft Word and Excel documents older than June 15, 2021 – you’ might ‘d use query: “(*.docx OR *.xlsx) LastModifiedTimeForRetention<2021-06-15”
If you need only Microsoft Word documents authored by some specific User and older than some specific date, you might use query: “*.docx author:Patti LastModifiedTimeForRetention<2021-01-01”
Search in Teams
You can successfully use refinements to search for the same in Teams. But you’d select “Files” tab for better experience:
Microsoft is constantly updating this product, so your experience might be different. Note also that when you search in teams – you search through all sites you have access to.
Search for old documents in OneDrive
You can use the same technique – putting “LastModifiedTime<2023-07-15” in search bar in OneDrive. In some ways it’s even better, as you can
search for files in all sites (not only your personal OD site)
select multiple file types you are interested in
Search with Graph API
The same query you can use to search content with Microsoft Graph API. Here is the code example:
If an organization is mature enough – it has data lifecycle policies established. If so – these polices must be applied to information stored in Microsoft 365 via retention policies. Retention policies are configured under Compliance center, but in particular applied to documents stored in SharePoint and OneDrive. Policies might dictate to retain documents or delete documents. Let say your organization is implementing retention policy that is configured to delete documents if 5 years passed after the file was last modified. That literally means all your files modified more than 5 years ago will be deleted and you will not even notice it. So – what if you want to know – which documents in your OneDrive or SharePoint site are older than 5 years?
Note: if your content was migrated from previous SharePoint versions (e.g. your old SharePoint on-prem farm) using migration tools – it’s last modified date is most likely was preserved (for instance, if your 5-years old doc was migrated 1 month ago – it last modified date would be 5 years ago). So if you are going to migrate your existing on-prem SharePoint to Microsoft 365 site and there are retention policies implemented in m365 – your content might be deleted right after migration.
Video tutorial
How to fetch all SharePoint documents older than some amount of time:
So far some findings I came up with after several Microsoft forms troubleshooting sessions… I’ll keep all the gotchas here as “how to” guide for myself. I’d be glad if this also helps you troubleshoot your Microsoft forms.
Microsoft forms links
You know, a user can create a Microsoft form. Then user can share it. There are two kind of links –
to respond
to edit/view/export results
Link to respond is kind of : https://forms.office.com/Pages/ResponsePage.aspx?id=FHPcfQGf1UWwEnFmW7HFRMgvShgV5J1Phpi7J1M_UoVUOUI1TzNQUEdWOTAzVVdRUVYzVVg4MlhZNC4u or short one: https://forms.office.com/r/kDKaHWauj7
Link “to collaborate” -e.g. with the link a person can edit and view results – is created under … “Create or duplicate”, and could be for anyone, for all people in org, and for specific people in org
if the link looks like "https://forms.office.com/Pages/DesignPageV2.aspx?subpage=design&FormId=<FormId>" then it’s for specific people in org
if the link looks the same but also contains "&Token=e3cd16ccf8034a3e868c68747e1f9584" then it’s for anyone with work or school account or for anyone in the organization
The one with the “edit” link can edit the form (including questions, answers options, and form visibility , view responses, delete responses, create a “summary link”, create a duplicate link, and export responses to excel (“Open in Excel” button). But cannot change collaboration options.
When user complete the form (after submit button), there is an option “Save my response” – if so – user will see this for with only one (his/her) response under forms app.
Collaborator is not seeing the form he/she has access to until follow the link.
Move the form to a group
Form owner can move the form to a group (and this is strictly recommended for all production forms). If so:
people who are group members (not only owners) will see this form under forms app – under specific group
form id will be changed, so the long “respond” link will be different. Though the short link will be the same. All links should continue to work: Old and New long and short respond links. Group-owned form id seemed to me be little longer – 88 characters vs 80 chars for individual-owned forms.
The trick Tomasz Szypula @toszypul shared here (I’m also citing the trick below) on how to find form owner having just a link works like a charm! Even for deleted owner`s IDs.
But let me share some more here. If the form is owned by group – the link will be similar, but with “/group/<groupId>” instead of “/user/<UserId>” . E.g. here: https://forms.office.com/formapi/api/7ddc7314-9f01-45d5-b012-71665bb1c544/groups/65714e55-87f4-49c3-b790-fc75d7349c8a/light/...
you can see “65714e55-87f4-49c3-b790-fc75d7349c8a” which is group Id. So you can use the same trick to figure out what group owns a form.
Deleting user who owns forms
What if the original owner of a form is no longer with a company? How can I transfer ownership of the form?
If the employee account was deleted or disabled, the global administrator or office application administrator of the organization who have a valid Forms license can transfer for ownership within 30 days of when an account was disabled/deleted. See details. Note that all forms user owned will be transferred to an admin, then an admin can transfer forms to a group so new owners can have access to answers etc.
Deleting a group that owns forms
When a form is owned by group and the group is getting deleted… tbp…
Audit log events
You can get some ideas on the form from an audit log, including
is the form owned by group or by user
to whom the form was shared with collaborator link
Below are kinds of events related to Microsoft 365 forms:
ListForms – Listed forms – viewed forms home page with list of forms
ViewForm – Viewed Form –
ViewRuntimeForm – Viewed response page
ViewResponses- Viewed responses
CreateResponse – Created response
ExportForm – Exported form – “export to excel” – file saved to the local machine (form owner=user)
ConnectToExcelWorkbook – Connected To Excel Workbook – “export to excel” – file saved to the teams SharePoint site under Documents (form owner = group)
toszypul replied to Jason_B1025
Jan 03 2022 03:17 AM - edited Jan 03 2022 03:18 AM
@Jason_B1025 I was able to get the ID of the user with a bit of a hack. Here are sample steps:
-Access the form using this designer direct URL https://forms.office.com/Pages/DesignPage.aspx?origin=shell#FormId=<YourFormID>
-Inspect the network traces. You will find a request similar to this
https://forms.office.com/formapi/api/72f988bf-86f1-41af-91ab-2d7cd011db47/users/e5351c57-d147-418e-89ab-3a3d50c235b6/light/forms('v4j5cvGGr0GRqy180BHbR1ccNeVH0Y5Bias6PVDCNbZUOUg4TkZJUEswSVQ1ODhNNkpHVVlMMldPTi4u')?$select=id,...
-The ID in bold is the AAD ID of the user
-Use Graph Explorer - Microsoft Graph to run this request to retrieve the username and email address of the owner https://graph.microsoft.com/v1.0/users/<UserID>
My 3×5 cents to this clever trick:
Not only “collaborator” link helps, but also “respond to”
If the form is owned by the group – the link would be similar but with “…/group/group_id/…” instead of “…/user/user_id/…”
If you have a SSO in your org and cannot find this call under network – try different browser or incognito mode or logging out before the call – as what you need appears at early stages – even before authentication
How do I know – is it a person-owned or group-owned form
Let say you got a claim that “we were able to work with the form, and now it is gone”, and the only you have is the “collaborators” link to the form – so you can edit form, view responses etc. but nobody knows who created that form… So how to determine who owns the form – person or group and what person/group.
The form is owned by a person if
form id is 80 characters length
on “Export to Excel” button – it saves/downloads excel file to the file system
audit log contains ExportForm (Exported form) event – as clicking “Export to Excel” button generates ExportForm (Exported form) event
You are seeing messages “This form can’t be distributed as it is asking for personal or sensitive information. Contact your admin for assistance. Terms of use”
or
“Form can no longer be accessed. This form has been flagged for potential phishing. Technical details”
Cause
The reason is: Microsoft enabled automated machine reviews to proactively detect the malicious collection of sensitive data in forms and temporary block those forms from collecting responses. More about it.
Solution
Ask your tenant global or security admin to go to the Microsoft Security Administration (Defender) Alerts:
If your list of alerts is too big – use filter by Policy: “Form blocked due to potential phishing attempt”.
To unblock the form or confirm it is phishing – admin should open the alert:
And then click “Review this form“. “Review the form” opens the page “https://forms.office.com/Pages/AdminPhishingReviewPage.aspx?id=” where is the form Id.
Then global/security admin can review the form and unblock it or confirm it is phishing:
There is a known problem in SharePoint (and Teams*) – complicated permissions system. Site owners/administrators provide access, site contributors upload documents and at any moment nobody knows – who has access to which files. As a result – sometimes sensitive documents become overshared (over-exposed). The biggest concern is when sites content is shared with “Everyone”. How do we know which content is shared with “Everyone”? Is there a report?
Obviously, only data owner knows who should have access to site documents, so we (SharePoint admins) do not fix permissions automatically (until there is a policy), but at least we can help site owners with reports and maybe initiate permissions review for “nasty” sites?
Below I’m sharing 4 possible solutions:
Solution #0 – Out-of-the-box EEEU report, but it comes with Premium license only
Solution #1 – also OotB report that comes with some 3-rd party tools
Solution #2 – PowerShell “Brute force” – free but require advanced skills and efforts
Solution #3 – Search-based – also free, and require less skills and efforts
(*) Microsoft with the introduction of Teams had to simplify permissions in SharePoint – since there should only be 3 types of access levels – owner, member and visitor. It was… in some ways, but in other ways it made things worse.
You are lucky if you can use 3-rd party tools (e.g. ShareGate, SysKit Point, AvePoint, Metalogix etc.), with the ability to get full permissions report. Though – if your m365 environment is not small – there might be a problem to get full permissions report for all tenant sites. Some tools allow you to get tenant-wide permissions report for specific Ids – this option should work better for large environments.
Still there might be another problem. Consider the following. When I say “shared with Everyone” – I actually mean at least 3 possible “everyone” system logins:
Everyone
Everyone except external users
All users
– those are system id’s, but what if there are other ids – e.g. migrated from on-prem or cloud-born custom security groups in tenant that also includes everyone or many users (e.g. dynamic security group that includes all org accounts)?
What if your Identity management operates security groups “All employee” or “Contractors” or “All licensed users”? Do you think these groups can be identified as “Everyone” groups? Do you think it’d be a good idea to check if content is shared with other large groups (not only system Everyone…)? Would you like to run permissions report separately for all groups that include “all” or “almost all” users?
Also, knowing that full reports heavily load the system, 3-rd party tools might by default limit “how deep report is” to the root site and lists/libraries, not including e.g. folders and items. So you might need go to settings and turn on “full deep” option Keep it in mind.
Obviously this option #1 is not free, as it requires licenses to be obtained. And still it worth to consider option 3.
Solution #2 (PowerShell “Brute force”)
You can get full permissions report per site or for entire tenant with PowerShell, which if free… The only you need is to write a script yourself or find existing one. Sounds easy?
Well, first problem is it takes a decent amount of time and competences to write such script. If if you find one – it would require some skills to adopt and run it. Frankly say, I have not seen so far scripts that were out-of-the-box ready to do that job. And it is not a good idea to run scripts you got from internet against your production environment until you understand it tested it and fully confident with.
Another possible problem – size of environment. The script I designed and use to get comprehensive permissions report might run hours against a good site – if I need full details on site/subsites, lists/libraries, folders and list items levels. So if you have less than 1000 sites – probably this approach can fly. But if your environment is 10K+ sites – it will take forever. So the approach might not work for large enterprise environments.
One might say – we can limit report with only root web permissions to get it faster. But this would be not accurate. And what is not accurate in the IT security – leads to even bigger risks. So, we need check permissions up to every item level deep, as even one file with sensitive info shared inappropriately can cause security issue. (Btw, 3-rd party tools usually by default limit reports to libraries level, so check reporting options…)
The other issue with this approach… Let say you got full permissions report… It would look like “resource -> group -> permissions”… How do you know for each group – what is the group in terms of membership?
Ok, if this solution is not easy to get working – what are other options?
Solution #3 (Search-based)
This solution is based on simple but clever idea: why do we need to iterate through all the tenant documents/items if all the content is already crawled by search? Search is also respect permissions. Can we just use search to get files shared with Everyone? Let us see.
What if we use some dummy user account with no specific permissions provided and no group memberships and try to search content on behalf of that account. The idea is if this user can see some data – then these data is open to everyone.
With this option we would use search query “*” and all 5 possible SharePoint entities – driveItem’,’listItem’,’list’,’drive’,’site’ to find everything that is shared with everyone. We’d pull results with paging (we’d use “from” option in a loop to pull all results). After we get all results – we’d select only unique site collections. But! We might have some problems here.
Problem #1. Again, for small environments or if there are not much “Open” sites – it would work. But for large enterprise environments the problem is the same as in “brute force”. Search would returns too many results – and it might take weeks (exact time is unpredictable) to get all of them. (Surely there are sites “legally” shared with everyone, public Office 365 group based sites, communication sites… So your search will be flooded with content from sites you already know are shared with all).
Problem #2. We are getting results with paging. But recently Microsoft started limiting number of returning results. E.g. your search request result might say like “there are 3659735 total hits” but after result number 1000 it just stops returning anything, even with paging.
Solution#3 Option #2 – loop sites
The idea is: why do we need to get all search results if even one result from a site would be enough to put the site to the list of “open” sites. In other words, we do not need all results from the site, we only need to know if there are any results from the site, at least one – so we know if the site is open for everyone or not.
So, consider the following approach:
You get list of all sites in tenant.
You run search request against each site in the loop (e.g. consider KQL option “Site: https://yourTenant.SharePoint.com/sites/YourSite”. If at least something found in the site – add the site to the “Open Sites” list. With this approach you will get list of sites shared with “Everyone…” in a predictable time.
Solution#3 Option #3 – exclude known “open” sites
There are sites “legally” shared with everyone – e.g. corporate portal, department communication sites, public teams, public Viva Engage communities etc. If it is know that these sites are public – you can exclude them from all sites list – so in the “Solution#3 Option #2 – loop sites” – you’d loop only through sites that are not supposed to be public. I know – percentage of “legally public” sites in tenant to all sites is a relatively small number, so should not significantly decrease elapsed time… but still.
Pros and cons of the Solution # 3
Pro 1: the only fast enough (at least predictable time to complete) and accurate enough to rely on solution.
Pro 2: There might be custom security groups intended to hold all or part of the enterprise (e.g. “All employee” or “all contractors”). If the enterprise comprises from several businesses or regions – it might be “All Business 1” or “All EMEA”… you got the idea. So you can tweak this search-based solution by adding your dummy account you are running search on behalf of to some of theses groups to find out if there are resources shared maybe not with everyone but with all “North America based” users or with “all employees”, which might make sense also.
Con 1 : crawling and indexing takes time, so search-based reports can miss recent changes in data and permissions
Con 2: this approach cannot be automated (since we need an interactive authentication). I.e. we need to run it manually every time.
Con 3: After we get all sites shared with everyone – we do not know – at what level permissions are broken and provided to everyone. It might be entire site or one file. It does not really help if you try to get all search results from the site. If you want to know what exactly is shared with everyone – on the site – run permissions report against this site (shortlist of sites).
Notes
Note 1: consider there are resources like “Styles Library” shared with everyone by default, especially on migrated sites
Note 2: this is a separate topic, but consider implementing/using sensitivity labels. At least you can start with high-sensitive sites. With sensitivity labels – site owners/member would know – what kind of site they are working on.
What’s next
Ok, we know list of SharePoint resources shared with everyone, but what would be the next step? Should we communicate to site owners – if so how to let site owners know that there are resources shared with Everyone… on their sites. To be continued…