Microsoft announced SharePoint Archive in 2023 and made the feature generally available in April 2024. Though there are good Microsoft articles on how to enable and configure SharePoint Archive, as well as some FAQ pages, there are still a lot of questions regarding behavior details, e.g.
what happens with Team content if the group-based site is Archived
is there an API or how do we archive/restore sites programmatically
would MS Graph Search API work for archived sites
I have just activated the feature and I plan to update this page with my gotchas and findings…
Why Archive?
If the site is not used, but you are not ready to delete it (or cannot delete it for compliance reasons), you can save money on storage by archiving the site:
– Regular SharePoint storage = $0.2 per GB per month
– Archived storage = $0.05 per GB per month
Reactivation fee
How much does it cost to restore a site from Archive? Microsoft says restore is free within 7 days. After 7 days it’ll cost $0.6 per GB. In the example below Microsoft charges me $1 to restore a simple OotB site with no documents:
Microsoft says “This amount is based on the retail price for reactivations. Your actual charges may be lower, and can be seen in Microsoft 365 Archive bill.”
Another confirmation is requested:
Reactivate site.
You’ll be charged a reactivation fee. This reactivation fee is based on the retail price for reactivations. Your actual charges may be lower, and can be seen in Microsoft 365 Archive bill.
The site will move back to Active sites page and start consuming active storage. This action can’t be cancelled once it starts. Estimated reactivation fee $1
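To put the numbers above together, here is a rough back-of-the-envelope sketch. It uses only the retail prices quoted in this post ($0.2 and $0.05 per GB per month, $0.6 per GB for reactivation); the function name Get-ArchiveEstimate is made up for illustration, and the sketch ignores the free 7-day window and whatever minimum or rounding produced the $1 estimate for an empty site. Actual charges may differ.

```powershell
# Rough estimate only: retail prices from this post, no minimum charge,
# no free 7-day reactivation window. Get-ArchiveEstimate is a hypothetical helper.
function Get-ArchiveEstimate {
    param(
        [Parameter(Mandatory)][double]$SizeGb,
        [Parameter(Mandatory)][int]$MonthsArchived,
        [double]$ActivePerGbMonth   = 0.20,   # regular SharePoint storage
        [double]$ArchivedPerGbMonth = 0.05,   # archived storage
        [double]$ReactivationPerGb  = 0.60    # reactivation after 7 days
    )

    $savedPerGbMonth = $ActivePerGbMonth - $ArchivedPerGbMonth
    $saved           = $savedPerGbMonth * $SizeGb * $MonthsArchived
    $reactivation    = $ReactivationPerGb * $SizeGb

    [pscustomobject]@{
        StorageSaved     = [math]::Round($saved, 2)
        ReactivationFee  = [math]::Round($reactivation, 2)
        NetIfReactivated = [math]::Round($saved - $reactivation, 2)
        # Months in archive after which savings cover a later reactivation
        BreakEvenMonths  = [math]::Round($ReactivationPerGb / $savedPerGbMonth, 1)
    }
}

# Example: a 500 GB site kept in archive for a year
Get-ArchiveEstimate -SizeGb 500 -MonthsArchived 12
```

With these retail prices the storage saving is $0.15 per GB per month, so a site has to stay archived for about 4 months before the saving covers a possible reactivation fee.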
Microsoft 365 groups is a key concept in today’s collaboration landscape that includes Microsoft Teams, Viva Engage, SharePoint etc. Access to resources is organized via groups. It is essential that every Microsoft 365 group has an owner (owners) so we have somebody to enforce Collaboration governance through.
Scenario
Let’s say you administer a large Microsoft 365 environment (e.g. ~100k+ users and/or ~50K+ sites) and after some years you have a lot of ownerless groups and sites (around 5k probably), and a lot of inactive groups and sites (maybe 15k). You are getting more and more ownerless groups – dozens each week. You are thinking of stopping the bleeding and cleaning this up…
Out-of-the-box we have the Microsoft 365 groups expiration policy and the Microsoft 365 ownerless groups policy. You might also have some third-party tools implemented – e.g. ShareGate, SysKit Point.
If you do not care, you might just activate both OotB Microsoft policies via the GUI; they are simple to activate. But once you activate the policies, they will trigger thousands of emails. Now imagine a person getting dozens of emails asking them to become an owner of, or to renew, a group they probably have no idea about… What will happen next? People will probably ignore these alerts. Then? Groups and sites will be automatically deleted. And then? Right, there will be a lot of noise, many angry users and high-priority tickets; you will have to restore sites/teams, and finally you’ll have to deal with all that mess manually.
So, what is the right way to clean up a large Microsoft 365 environment from ownerless and inactive teams, groups and sites? Not a trivial question, huh?
Solution
Disclaimer: I’m sharing my personal opinion here with no obligations or warranty etc., so you should dig into all the technologies used and build your own plan based on your particular situation. But my personal opinion is based on my 15+ years of experience with SharePoint, including really large environments.
Note: It is always a good idea to discuss your plans with your org’s communication team and helpdesk/service-desk to align clean-up activities with other initiatives and let other people be prepared.
High-level steps for group-based Sites:
consider implementing a “Minimum 2 owners per group” policy to stop the bleeding. Currently Microsoft 365 does not have such functionality, so consider a third-party tool like SysKit Point or a custom PowerShell script that sends notifications
apply this policy to groups where you already have 2+ owners – it’ll be safe
apply this policy to all other groups in chunks
consider custom PowerShell clean-up, e.g. you can simply delete groups with no owners and no members and/or inactive groups with no content and/or groups that are inactive for a long time (this must be aligned with business and legal)
avoid scoping down this policy via people (security groups)
implement it for all groups and all users, with 6-7 weeks and a custom e-mail template
implement Microsoft groups expiration policy in “Clean-Up” configuration… again, there are a few different strategies – see this article
change Microsoft Ownerless groups policy configuration to a “Permanent” mode configuration set
(or) change Microsoft 365 groups expiration policy with a “Permanent” mode configuration
(or) develop and implement a custom staged decommissioning process – a kind of “last chance” set of scripts to discontinue groups that are still ownerless after all the efforts above. Staged means we do not just delete these groups, but e.g. we can
– rename ownerless groups
– convert groups from public to private
– set teams to archived mode
– exclude sites from Copilot search with “Restricted SharePoint Search” etc.
– set the site to no-access mode
– remove members from the group
– and finally delete the group with the connected team and site
I have a separate article on the custom staged decommissioning process.
Note: There will always be ownerless groups in a large environment. We have to live with it. So think of all the steps above as a process that we need to run on a regular basis.
All of the above was mostly about group-based sites (as we have OotB Microsoft policies for groups), but we probably have the same problem (or an even worse one) with standalone sites (that would be a separate topic).
What is archiving SharePoint sites and why we’d need it?
Disclaimer: Archival that was announced at Microsoft Inspire 2023 (Introducing Microsoft 365 Backup and Microsoft 365 Archive) is not what we are discussing here. Though it might be considered as an option (as archived sites are still visible for admins but not visible for users), MS SharePoint Archive requires additional licensing.
Scenario
You are in the process of cleaning-up large Microsoft 365 environment. You need to delete SharePoint sites (e.g. due to inactivity) but you cannot get confirmation from site owners (e.g. sites or groups are ownerless).
Deleted sites can be restored within 93 days of deletion if somebody raises a hand, but there is still a risk of losing important information, e.g. in case the site is needed once a year. So you need to do the clean-up, but at the same time you want to decrease the risk of losing information.
So, you might want to do something with sites to encourage users to volunteer to be the site owner if they want to keep the site – e.g. prevent using the site the regular way and let users know that the site will be deleted etc., but do not actually delete the site until it is fully clear that nobody needs it and it can be safely deleted.
Let us call it “Staging” period. Depending on your org culture/rules/licensing etc. it might be 6 months, or 1 year or 5 years or more.
Approach options
Generally, the options are (in random order):
Set site to Read-Only mode
Set site to No-Access mode
Convert group from Public to Private
Remove access to the site (remove users from group)
Rename the site
Put a banner on a top bar with a message
Message to Teams or Yammer chat
Send e-mail to site members
Implement a Microsoft 365 ownerless groups policy
You might choose to set sites to read-only mode or even no-access mode. If so, users who still need the site lose the ability to work with it, but the site is not deleted. Consider archiving as a kind of scream-test phase before the actual site deletion.
If a user who needs this site screams (raises a ticket to restore the site), you can trigger the processes of a) finding a new owner for the site b) excluding the site from the clean-up process c) actually restoring the site to normal mode
There are some options to set a site to Read-Only or NoAccess mode. Here is the PowerShell command:
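# Assumes an existing Connect-PnPOnline session with SharePoint admin rights
$siteurl = "https://contoso.sharepoint.com/teams/Team-SO-B"
# Show the current lock state
Get-PnPTenantSite -Identity $siteurl | ft -a Url, LockState
# Read-only: content can be viewed but not changed
Set-PnPTenantSite -Identity $siteurl -LockState ReadOnly
Get-PnPTenantSite -Identity $siteurl | ft -a Url, LockState
# No access: the site is unavailable to users
Set-PnPTenantSite -Identity $siteurl -LockState NoAccess
Get-PnPTenantSite -Identity $siteurl | ft -a Url, LockState
# Back to normal mode
Set-PnPTenantSite -Identity $siteurl -LockState Unlock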
The problem is: what if the site is Teams-connected or Yammer-connected or just group-based? Here are some test results:
| Services the SharePoint site is connected to | Site state: Read-Only | Site state: NoAccess |
|---|---|---|
| Outlook only | N/A | N/A |
| SharePoint and Outlook | Outlook emails: OK. Outlook files: read-only experience; no options to upload or create a document; documents are opened in read-only mode. “The file couldn’t be saved to group” error message when trying to save a file to a group library. | Outlook emails: OK. Outlook files: empty screen; no error messages; documents are not visible. “The file couldn’t be saved to group” error message when trying to save a file to a group library. |
| SharePoint and Yammer | | |
| SharePoint, Teams and Outlook | Teams chats: OK. Teams files: documents are opened as read-only; no options to upload or create a new document. SharePoint: “This site is read-only at the administrator’s request.” | Teams chats: OK. Teams files: “403 FORBIDDEN” error message. SharePoint: “This site can’t be reached. The webpage at https://contoso.sharepoint.com/teams/Team-STO-B might be temporarily down or it may have moved permanently to a new web address. ERR_INVALID_RESPONSE” |
So you can see that the behavior is inconsistent: users can still chat in Teams and Yammer and consume SharePoint content (if the site is read-only), or they get error messages or not very meaningful results (if the site is in NoAccess mode). So it would not be clear to users that the site is going to be decommissioned.
Sometimes, mostly during PoC or testing policies like retention policy or lifecycle policy you would need some documents created and updated weeks, months or even years ago.
But if you create or upload a document in a SharePoint library, it will be just a regular new document. So, how do you get old documents into the new environment?
I see two options:
Sync with OneDrive
If you sync a library with your local folder (done by the Microsoft OneDrive desktop app) and put some old document in your synced folder, the doc will be synchronized back to the SharePoint library with its Created and Modified properties preserved.
Make the document older with PowerShell
With the “Set-PnPListItem” PowerShell command you can update not only properties like Title, but also “Created By”, “Modified By” and even the date and time the document was created and modified, via “Created” and “Modified”. Optionally you can play with document history with the “-UpdateType” parameter. UpdateType possible values are:
Update: Sets field values and creates a new version if versioning is enabled for the list
SystemUpdate: Sets field values and does not create a new version. Any events on the list will trigger.
UpdateOverwriteVersion: Sets field values and does not create a new version. No events on the list will trigger
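As a minimal sketch of the PowerShell option: the snippet below backdates one document by three years. It assumes PnP.PowerShell is installed and that you are already connected to the target site with Connect-PnPOnline; the library name "Documents" and item ID 1 are placeholders, not values from my tests.

```powershell
# Assumes an existing Connect-PnPOnline session to the site that holds the library.
# "Documents" and item ID 1 are hypothetical placeholders.
$listName = "Documents"
$itemId   = 1
$oldDate  = (Get-Date).AddYears(-3)

# Overwrite Created/Modified without creating a new version or firing list events
Set-PnPListItem -List $listName -Identity $itemId `
    -Values @{ Created = $oldDate; Modified = $oldDate } `
    -UpdateType UpdateOverwriteVersion

# Read the values back to confirm the change
$item = Get-PnPListItem -List $listName -Id $itemId
$item.FieldValues["Created"], $item.FieldValues["Modified"]
```

UpdateOverwriteVersion is used here so the test document does not get an extra version with today's date; switch to Update or SystemUpdate if you want the version history or list events described above.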
Adaptive scopes are good, but what if both policies are implemented? Which one wins? The scenario for two policies might be: a static retention policy is implemented as the default retention policy for all sites, and if a site requires different retention or deletion, it should fall under one of the adaptive scopes so that an adaptive retention policy is applied.
In the basic condition builder you can use the objects “Site Url”, “Site Name” and “Refinable String 0” .. “Refinable String 99”. Conditions would be “is equal to”, “is not equal to”, “starts with” and “not starts with”. Or you can select “Advanced query builder” and enter a KQL query.
Advanced query builder
Advanced query builder allows us to use more site properties than “Site Url”, “Site Name” and “Refinable Strings”, and more conditions than “is (not) equal to” and “(not) starts with”.
E.g. we can use the “Title”, “Created”, “Modified” site properties and the “=”, “:”, “<”, “>”, “<=”, “>=” conditions.
Working queries examples:
created>=2022-07-21
modified>1/31/2023
created>12/31/2021 AND modified>=7/31/2022
created<=2020-11-15 OR modified>2023-02-06 (?)
created<=2020-1-15 OR modified>2023-01-31 (?)
created<=11/15/2020 OR modified>1/31/2023
title:test
SiteTitle:test
RefinableString09:Test*
RefinableString09<>Test
RefinableString09=Birding AND RefinableString08<>Included
Not working queries examples:
site:https://contoso.sharepoint.com/sites/test*
RefinableString11 = Birds # (do not use spaces in advanced query)
Path:https://contoso-my.sharepoint.com
Template:STS
Template:"SITEPAGEPUBLISHING#0"
Template:SITEPAGEPUBLISHING*
? RefinableString09<>Birding AND RefinableString08:Official
modified>31/1/2023 (should be like modified>2023-01-31)
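Since date format and stray spaces are the easiest things to get wrong in the examples above, here is a small helper sketch that builds date conditions in the yyyy-MM-dd form with no spaces around the operator. The function name New-ScopeDateCondition is made up for illustration; it only produces the query text, it does not talk to Purview, and whether a combined query behaves as expected still has to be checked in “Scope details”.

```powershell
# Hypothetical helper: builds one date condition for the advanced query builder.
# Only formats text; it does not validate the query against the service.
function New-ScopeDateCondition {
    param(
        [Parameter(Mandatory)][ValidateSet('created', 'modified')][string]$Property,
        [Parameter(Mandatory)][ValidateSet('=', '<', '>', '<=', '>=')][string]$Operator,
        [Parameter(Mandatory)][datetime]$Date
    )

    # Culture-independent ISO-style date, e.g. 2023-01-31
    $iso = $Date.ToString('yyyy-MM-dd', [System.Globalization.CultureInfo]::InvariantCulture)

    # No spaces between property, operator and value
    "$Property$Operator$iso"
}

$createdPart  = New-ScopeDateCondition -Property created  -Operator '>'  -Date '2021-12-31'
$modifiedPart = New-ScopeDateCondition -Property modified -Operator '>=' -Date '2022-07-31'
"$createdPart AND $modifiedPart"
```

The last line yields created>2021-12-31 AND modified>=2022-07-31, which can be pasted into the advanced query builder.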
Query against custom site property (aka property bag value)
You can create a custom site property and assign a value to it with Set-PnPAdaptiveScopeProperty or Set-PnPPropertyBagValue. The property must be indexed (the “Indexed” parameter). Once the property is set up, Microsoft 365 search crawls the site and creates a crawled property. Then you map the crawled property to one of the pre-created refinable string managed properties. You can assign an alias to this managed property.
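A minimal sketch of the first step, assuming PnP.PowerShell is installed and you are already connected to the target site with Connect-PnPOnline. The property key "SiteSubject" is a placeholder I made up for this example; the value "Birding" is the one used in the tests below.

```powershell
# Assumes an existing Connect-PnPOnline session to the target site.
# "SiteSubject" is a hypothetical property bag key.
Set-PnPAdaptiveScopeProperty -Key "SiteSubject" -Value "Birding"

# Read the property bag value back to confirm it was written
Get-PnPPropertyBag -Key "SiteSubject"
```

This only writes the site property. The crawl and the mapping of the crawled property to a refinable string still have to happen before an adaptive scope query can match the value.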
In my test scenario I used RefinableString09 with alias SiteCustomSubject.
| Site property value | Query | Result |
|---|---|---|
| Birding | RefinableString09:Bird | does not work |
| Birding | SiteCustomSubject:Bird | does not work |
| Birding | RefinableString09:Bird* | works |
| Birding | SiteCustomSubject:Bird* | does not work |
| Birding | RefinableString09:Birding | works |
| Birding | SiteCustomSubject:Birding | does not work |
| Birding | RefinableString09:Birding* | works |
| Birding | RefinableString09=Birding | works |
| Birding | RefinableString09=Bird | does not work |
| Birding | RefinableString09=Bird* | does not work |
| Birding | SiteCustomSubject=Birding | does not work |
| | RefinableString09<>Birding | works |
| | RefinableString09=Birding AND RefinableString08<>Included | works |
Query against multi-value property.
| Site property value | Query | Result |
|---|---|---|
| TestA TestB | RefinableString09:TestA | works |
| TestA TestB | RefinableString09 = 'TestA TestB' | does not work |
| TestA TestB | ??? RefinableString09='Test10 Test5' | does not work |
| TestA TestB | RefinableString09:TestB | ? |
| TestA,TestB | RefinableString09:Test* | works |
| TestA,TestB | RefinableString09=Test* | does not work |
| TestA,TestB | RefinableString09:Test | does not work |
| TestA,TestB TestA;TestB TestB TestA TestA TestB | RefinableString09:TestB | works |
| TestA, TestB TestB,TestA TestA TestB | RefinableString09=TestA | does not work |
| TestA,TestB | (basic) RefinableString09 starts with test | works |
Some more findings
Modify adaptive scope
If you need to modify an adaptive scope, you’d better delete it and create a new one. The reason: when you validate which sites are included in the scope via the GUI (the “Scope details” button), you want to see only sites that are in scope. That is not the case after you modify the scope, because sites that are no longer in scope are still listed, with a “Removed” status.
Alternatively, you can use a filter to hide sites removed from the scope.
what else?
What is the takeaway from this for SharePoint administrators? We would be asked to configure SharePoint the way compliance…
Microsoft recently implemented “Adaptive” retention policies. At step 2 of “Create retention policy” you’ll be asked “Choose the type of retention policy to create”: “A policy can be adaptive or static. Advantage of an adaptive policy will automatically update where it’s applied based on attributes or properties you’ll define. A static policy is applied to content in a fixed set of locations and must be manually updated if those locations change.”
And if you selected “Adaptive” – on the next step you will need to provide the adaptive scope (so at this moment you should already have created your adaptive scopes):
So, let us create your adaptive scopes. What type of scope do you want to create? SharePoint sites…
And then you’ll have nothing more than a set of conditions:
where you can use objects: “Site Url”, “Site Name” and “Refinable String 0” .. “Refinable String 99”. Conditions would be “is equal to”, “is not equal to”, “starts with” and “not starts with”. Or you can select “Advanced query builder” and enter a KQL query.
Advanced query builder for SharePoint Adaptive Scope
Microsoft recently implemented “Adaptive Scopes” for retention policies. Before that we had to use “static” scopes only, i.e. we could apply the policy to all sites or to specific sites that we had to select manually. With adaptive scopes we can use rules like “This adaptive scope must include all sites with Site Url starts with A or Site name starts with A” and so on. And then we’d apply the retention policy to all the sites in this adaptive scope. This is nice, but site Url and site name do not actually have much to do with categorizing sites for retention policies. How can we implement site classification to apply different policies to different site categories? Luckily, when you configure adaptive scopes, you can use Refinable Strings, and refinable strings are something you can configure to take values from custom site properties. So finally we can assign a specific value to a custom site property, and the site will fall under this or that retention policy based on the value we dynamically assigned to the site.
Wait until the search crawler picks up your site property. Now you have a crawled property.
Search schema mapping
As you know, Refinable Strings are just refinable managed properties pre-created by Microsoft. So you can select one that is not used(*) and map it to the crawled property. You can assign an alias so you can easily identify what e.g. RefinableString55 is about (but aliases do not work in the advanced query).
(*) Notes
select one that is not used
Selecting one that is not used is important, because if you select a refinable string that is already taken at some site level, there is a conflict. So before configuring pre-created refinable properties at tenant level, I’d recommend getting a report on managed properties taken at site level. It would be a good idea to agree with site owners on property ranges (e.g. from 00 to 99 – reserved for tenant use, from 100 to 199 – available for site-level search customizations). And/or you can, after getting the report on managed properties taken at site level, reserve all unused managed properties by assigning aliases, e.g. “this-property-55-is-reserved-by-admin-for-tenant-level-config”.
site custom script
If site custom scripts are enabled (DenyAddAndCustomizePages = false), then a site collection admin can change site properties. So if you do not want the property to be altered at site level, ensure that the noscript site property is enabled (DenyAddAndCustomizePages equals true).
If site custom scripts are disabled (DenyAddAndCustomizePages = true), then an admin must enable them before using “Set-PnPPropertyBagValue” cmdlet (then disable again). “Set-PnPAdaptiveScopeProperty” cmdlet handles this automatically.