How Umbraco prepares content requests
The process starts in UmbracoRouteValueTransformer, where it gets the HttpContext and RouteValueDictionary from the ASP.NET Core framework.
What it does:
It ensures Umbraco is ready, and the request is a document request.
Ensures there's content in the published cache. If there isn't, it routes to the RenderNoContentController, which displays the no-content page you see when running a fresh install.
Creates a published request builder.
Routes the request with the request builder using PublishedRouter.RouteRequestAsync(…).
This will handle redirects, find domain, template, published content and so on.
Builds the final IPublishedRequest.
Sets the routed request in the Umbraco context, so it will be available to the controller.
Creates the route values with the UmbracoRouteValuesFactory.
This is what actually routes your request to the correct controller and action, and allows you to hijack routes.
Sets the route values on the HTTP context.
Handles posted form data.
Returns the route values to ASP.NET Core so it routes your request correctly.
When the RouteRequestAsync method is invoked on the PublishedRouter it will:
FindDomain().
Handle redirects.
Set culture.
Find the published content.
Only if the request doesn't already have content, which lets you assign it in a custom way with a custom router handler.
Find the template.
Set the culture (again, in case it was changed).
Publish RoutingRequestNotification.
Handle redirects and missing content.
Initialize some internal state.
We will discuss a few of these steps below.
The FindDomain method looks for a domain matching the request Uri:
Uses a greedy match: “domain.com/foo” takes precedence over “domain.com”.
Sets the published content request’s domain.
If a domain was found:
Sets the published content request’s culture accordingly.
Computes the domain Uri based upon the current request ("domain.com" for "http://domain.com" or "https://domain.com").
Else:
Sets the published content request’s culture to the default (first language, else system).
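The greedy match can be illustrated with a small language-neutral sketch. It is written in Python rather than against Umbraco's C# API, and the function name, the dictionary shape, and the culture codes are assumptions made for illustration only.

```python
from urllib.parse import urlsplit


def find_domain(request_url, domains, default_culture):
    """Return (domain, culture) for the most specific configured domain.

    `domains` maps a configured domain such as "domain.com/foo" to a culture.
    This models the greedy match only; it is not Umbraco's implementation.
    """
    url = urlsplit(request_url)
    # Compare on "host/path/" with a trailing slash, so "/food" does not match "/foo".
    target = (url.netloc + url.path).rstrip("/") + "/"
    matches = [d for d in domains if target.startswith(d.rstrip("/") + "/")]
    if not matches:
        # No domain found: fall back to the default culture.
        return None, default_culture
    best = max(matches, key=len)  # greedy: the longest matching domain wins
    return best, domains[best]


domains = {"domain.com": "en-US", "domain.com/foo": "da-DK"}  # hypothetical setup
print(find_domain("https://domain.com/foo/bar", domains, "en-US"))
```

With this setup a request below "domain.com/foo" picks up that domain and its culture, while any other request on the host falls back to "domain.com".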
When finding published content, the PublishedRouter will first check if the PublishedRequestBuilder already has content. If it doesn't, the content finders will kick in. There are many different types of content finders, such as find by url, by id path, and more. If none of the content finders manages to find any content, the request will be set as 404 and the ContentLastChanceFinder will run. This will try to find a page to handle a 404; if it can't find one, the ugly 404 will be used.
You can also implement your own content finders and last chance finder. For more information, see IContentFinder.
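The flow described here (existing content first, then each content finder in order, then the last chance finder) can be modelled in a few lines. This is a Python sketch of the control flow only, not Umbraco's C# API; the request dictionary, the PAGES lookup, and the finder functions are hypothetical.

```python
PAGES = {"/": "Home", "/our-products": "Our Products"}  # hypothetical published cache


def find_by_url(request):
    """A finder returns True once it has assigned content to the request."""
    page = PAGES.get(request["path"])
    if page is None:
        return False
    request["content"] = page
    return True


def not_found_page(request):
    """Hypothetical last chance finder that supplies a custom 404 page."""
    request["content"] = "Custom 404 page"
    return True


def find_content(request, finders, last_chance_finder=None):
    if request.get("content") is not None:
        return request  # content was already set, so no finder runs
    for finder in finders:
        if finder(request):
            return request  # the first finder that succeeds stops the chain
    request["status"] = 404
    if last_chance_finder is not None:
        last_chance_finder(request)
    return request


print(find_content({"path": "/missing"}, [find_by_url], not_found_page))
```

Because the chain stops at the first finder that succeeds, the order of the finders decides which one handles a request.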
The PublishedRouter will also follow any internal redirects there might be. The number of redirects it follows is limited, so that an infinite loop of redirects cannot spiral out of control.
Once the content has been found, the PublishedRouter moves on to finding the template.
First, it checks if any content was found. If it wasn't, it sets the template to null, since there can't be a template without content.
Next, it checks whether an alternative template should be used. An alternative template will be used if the router can find a value with the key "altTemplate" in either the querystring, form, or cookie, and either the content was found by the content finders (so it is not the 404 page), or it's an internal redirect and the web routing setting InternalRedirectPreservesTemplate is enabled.
If no alternative template is found the router will get the template with the file service, using the ID specified on the published content, and then assign the template to the request.
If an alternative template is specified, the router will check if it's an allowed template for the content, if the template is not allowed on that specific piece of content it will revert to using the default template. If the template is allowed it will then use the file service to get the specified alternative template and assign the template to the request.
The router will pick up the redirect and perform it. There is no need to write your own redirects.
If the router can't find a template, it will check whether route hijacking is in place. If it is, the router will run the hijacked route. If it is not, the router will set the content to null and run through the routing of the request again, so that the last chance finder can find a 404 page.
The following method is called from the "PublishedContentRequest.PrepareRequest()" method: FindPublishedContentAndTemplate(). The outline below briefly describes what this method does:
FindPublishedContent()
Handles redirects
HandlePublishedContent()
FindTemplate()
FollowExternalRedirect()
HandleWildcardDomains()
No content?
Run the LastChanceFinder
Is an IContentFinder, resolved by ContentLastChanceFinderResolver
By default, is null (= ugly 404)
Follow internal redirects
Take care of infinite loops
Ensure user has access to published content
Else redirect to login or access denied published content
Loop while there is no content
Take care of infinite loops
Use altTemplate if
Initial content
Internal redirect content, and InternalRedirectPreservesTemplate is true
No alternate template?
Use the current template if one has already been selected
Else use the template specified for the content, if any
Alternate template?
Use the alternate template, if any
Else use what’s already there: a template, else none
Alternate template is used only if displaying the intended content
Except for internal redirects
If you enable InternalRedirectPreservesTemplate
Which is false by default
Alternate template replaces whatever template the finder might have set
ContentFinderByNiceUrlAndTemplate
/path/to/page/template1?altTemplate=template2 → template2
Alternate template does not fall back to the specified template for the content
/path/to/page?altTemplate=missing → no template
Even if the page has a template
But preserves whatever template the finder might have set
/path/to/page/template1?altTemplate=missing → template1
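The alternate template rules in this outline can be summarised as a small decision function. The sketch below is a Python model of those rules only. The function and parameter names are hypothetical, and the handling of templates that exist but are not allowed for the content is left out.

```python
def resolve_template(alt_template, finder_template, content_template,
                     known_templates, is_initial_content=True,
                     is_internal_redirect=False, preserves_template=False):
    """Model of the altTemplate rules; returns a template alias or None."""
    # altTemplate only applies to the intended content, or to internal
    # redirects when InternalRedirectPreservesTemplate is enabled.
    use_alt = is_initial_content or (is_internal_redirect and preserves_template)
    if not use_alt or not alt_template:
        # No alternate template: keep what a finder selected,
        # else use the template specified for the content, if any.
        return finder_template or content_template
    if alt_template in known_templates:
        return alt_template  # replaces whatever the finder might have set
    # Unknown alternate template: no fallback to the content's template,
    # but a template set by the finder is preserved.
    return finder_template


known = {"template1", "template2", "home"}
print(resolve_template("template2", "template1", "home", known))
```

The three example URLs above map to `("template2", "template1")`, `("missing", None)` and `("missing", "template1")` for the first two arguments.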
Content.GetPropertyValue("umbracoRedirect")
If it’s there, sets the published content request to redirect to the content
Will trigger an external (browser) redirect
Finds the deepest wildcard domain between
Domain root (or top)
Request’s published content
If found, updates the request’s culture accordingly
This implements separation between hostnames and cultures
Information about creating your own content finders
To create a custom content finder, with custom logic to find an Umbraco document based on a request, implement the IContentFinder interface and use either an Umbraco builder extension or a composer to add it to the ContentFinderCollection.
Umbraco runs all content finders in the collection 'in order', until one of the IContentFinders returns true. Once this occurs, the request is then handled by that finder, and no further IContentFinders are executed. Therefore the order in which ContentFinders are added to the ContentFinderCollection is important.
The ContentFinder can set the PublishedContent item for the request, or template or even execute a redirect.
For example, an IContentFinder could find the document with id 1234 whenever the Url begins with /woot.
You can use either an extension on the Umbraco builder or a composer to access the ContentFinderCollection and add or remove specific ContentFinders.
First create the extension method:
Then invoke it in ConfigureServices in the Startup.cs file:
To set your own 404 finder, create an IContentLastChanceFinder and set it as the ContentLastChanceFinder. This is useful if, for example, you have a multilingual site and need to find the appropriate 404 page in the correct language.
An IContentLastChanceFinder will always return a 404 status code. A typical implementation of IContentLastChanceFinder gets the 404 page for the current language of the request.
You can configure Umbraco to use your own implementation in the ConfigureServices method of the Startup class in Startup.cs:
How the Umbraco inbound request pipeline works
The inbound process is triggered by UmbracoRouteValueTransformer and then handled by the PublishedRouter. The published content request preparation process kicks in and creates a PublishedRequestBuilder, which will be used to create a PublishedContentRequest.
The PublishedContentRequest object represents the request which Umbraco must handle. It contains everything that will be needed to render it. All this occurs when the Umbraco module knows that an incoming request maps to a document that can be rendered.
There are 3 important properties, which contain all the information needed to find a node:
Domain is a DomainAndUri object that is a standard Domain plus the fully qualified uri. For example, the Domain may contain "example.com" whereas the Uri will be fully qualified for example "https://example.com/".
It contains the content to render:
Contains template information:
The published request is created using the PublishedRequestBuilder, which implements IPublishedRequestBuilder. It's only in this builder that it's possible to set values, such as domain, culture, published content, redirects, and so on.
You can subscribe to the 'routing request' notification, which is published right after the PublishedRequestBuilder has been prepared, but before the request is built and processed. Here you can modify anything in the request before it is built and processed, for example the content or the template:
What the Umbraco Request Pipeline is
This section describes what the Umbraco Request Pipeline is. It explains how Umbraco matches a document to a given request and how it generates a URL for a document.
The request pipeline is the process of building up the URL for a node and resolving a request to a specified node. It ensures that the right content is sent back.
The pipeline works in two directions: inbound and outbound.
Outbound is the process of building up a URL for a requested node. Inbound is every request received by the web server and handled by Umbraco.
This section will describe the components that you can use to modify Umbraco's request pipeline: IContentFinder & IUrlProvider.
The outbound pipeline consists of the following steps:
To explain things we will use the following content tree:
When the URL is constructed, Umbraco will convert every node in the tree into a segment. Each published Content item has a corresponding url segment.
In our example "Our Products" will become "our-products" and "Swibble" will become "swibble".
The segments are created by the "Url Segment provider".
The DI container of an Umbraco implementation contains a collection of UrlSegmentProviders. This collection is populated during Umbraco boot up. Umbraco ships with a 'DefaultUrlSegmentProvider', but custom implementations can be added to the collection.
When the GetUrlSegment extension method is called for a content item + culture combination, each registered IUrlSegmentProvider in the collection is executed in 'collection order'. This continues until a particular UrlSegmentProvider returns a segment value for the content, after which no further UrlSegmentProviders in the collection will be executed. If no segment is returned by any provider in the collection, a DefaultUrlSegmentProvider will be used to create a segment. This ensures that a segment is always created, for example when the default provider has been removed from the collection without a new one being added.
To create a new Url Segment Provider, implement the following interface:
Note each 'culture' variation can have a different Url Segment!
The returned string will be the Url Segment for this node. Any string value can be returned here but it cannot contain the URL segment separator character /. This would create additional "segments": something like 5678/swibble is not allowed.
For the segment of a 'product page', add its unique SKU / product ref to the existing Url segment:
The returned string becomes the native Url segment - there is no need for any Url rewriting.
For our "swibble" product in our example content tree, the ProductPageUrlSegmentProvider would return a segment "swibble-123xyz". In this case, 123xyz is the unique product sku/reference for the swibble product.
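The provider chain and the SKU-based segment can be modelled as follows. This is a Python sketch of the behaviour described in this section, not Umbraco's C# interface. The content dictionaries, the "productPage" type alias, the "sku" key, and the simplified slug function are assumptions.

```python
import re


def to_url_segment(text):
    """Rough stand-in for a URL-safe slug; not Umbraco's ToUrlSegment()."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def default_segment_provider(content, culture=None):
    # Prefer an editor-supplied umbracoUrlName, else the content name.
    return to_url_segment(content.get("umbracoUrlName") or content["name"])


def product_page_segment_provider(content, culture=None):
    if content.get("type") != "productPage":
        return None  # not a product page: let the next provider try
    return f"{to_url_segment(content['name'])}-{to_url_segment(content['sku'])}"


def get_url_segment(content, providers, culture=None):
    for provider in providers:
        segment = provider(content, culture)
        if segment:
            return segment  # the first provider returning a value wins
    # Nothing returned a segment: fall back to the default behaviour.
    return default_segment_provider(content, culture)


providers = [product_page_segment_provider, default_segment_provider]
swibble = {"name": "Swibble", "type": "productPage", "sku": "123xyz"}
print(get_url_segment(swibble, providers))
```

A node that is not a product page, such as "Our Products", falls through to the default provider and gets "our-products".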
Register the custom UrlSegmentProvider with Umbraco, either using a composer or an extension method on the IUmbracoBuilder:
The Default Url Segment provider builds its segments by looking for one of the below values, checked in this order:
A property with alias umbracoUrlName on the node. This is a convention-led way of giving editors control of the segment name. With variants, this can vary by culture.
The 'name' of the content item, e.g. content.Name.
The Umbraco string extension ToUrlSegment() is used to produce a clean 'Url safe' segment.
To create a path, the pipeline will use the segments of each node to produce a path.
If we look at our example, the "swibble" node will receive the path: "/our-products/swibble". If we take the ProductPageUrlSegmentProvider from above, the path would become: "/our-products/swibble-123xyz".
But what if there are multiple websites in a single Umbraco implementation? In this multi-site scenario, an (internal) path to a node such as "/our-products/swibble-123xyz" could belong to any of the sites, or match multiple nodes in multiple sites. In this scenario, additional sites will have their internal path prefixed by the node id of their root node. Any content node with a hostname defines a “new root” for paths.
Paths can be cached, what comes next cannot (http vs https, current request…).
Domain without path e.g. "www.site.com" will become "1234/path/to/page"
Domain with path e.g. "www.site.com/dk" will produce "1234/dk/path/to/page" as path
No domain specified: "/path/to/page"
Unless HideTopLevelNodeFromPath config is true, then the path becomes "/to/page"
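The path rules listed above can be expressed as one small function. This is a Python sketch with hypothetical parameter names, not Umbraco code. It assumes that `segments` holds the segments below the domain root when a domain is assigned, and the segments from the top-level node down when no domain is assigned.

```python
def build_internal_path(segments, domain_root_id=None, domain_path="",
                        hide_top_level_node=False):
    """Build the internal path for a node from its url segments."""
    if domain_root_id is not None:
        # A node with a hostname defines a new root: prefix the path with the
        # root node id, followed by any path that is part of the domain.
        prefix = [part for part in domain_path.split("/") if part]
        return f"{domain_root_id}/" + "/".join(prefix + list(segments))
    if hide_top_level_node:
        segments = segments[1:]  # drop the top-level node's segment
    return "/" + "/".join(segments)


print(build_internal_path(["path", "to", "page"], domain_root_id=1234))
print(build_internal_path(["path", "to", "page"], domain_root_id=1234, domain_path="/dk"))
print(build_internal_path(["path", "to", "page"]))
print(build_internal_path(["path", "to", "page"], hide_top_level_node=True))
```

The four calls reproduce the four cases in the list, in the same order.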
The Url of a node consists of a complete URI: the scheme, domain name, (port) and the path.
In our example the "swibble" node could have the following URL: "http://example.com/our-products/swibble"
Generating this url is handled by the Url Provider. The Url Provider is called whenever a request is made in code for a Url e.g.:
The DI container of an Umbraco implementation contains a collection of UrlProviders. This collection is populated during Umbraco boot up. Umbraco ships with a DefaultUrlProvider, but custom implementations can be added to the collection. When .Url is called, each IUrlProvider registered in the collection is executed in 'collection order' until a particular IUrlProvider returns a value, and no further IUrlProviders in the collection will be executed.
Umbraco ships with a DefaultUrlProvider, which provides the implementation for the out of the box mapping of the structure of the content tree to the url.
If the current domain matches a root domain of the target content.
Return a relative Url.
Else must return an absolute Url.
If the target content has only one root domain.
Use that domain to build the absolute Url.
If the target content has more than one root domain.
Figure out which one to use.
To build the absolute Url.
Complete the absolute Url with scheme (http vs https).
If the domain contains a scheme use it.
Else use the current request’s scheme.
If "addTrailingSlash" is true, then add a slash.
Then add the virtual directory.
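The decision steps above can be sketched as a single function. This is a simplified Python model and not the DefaultUrlProvider itself. The function and parameter names are hypothetical, choosing between several root domains is reduced to taking the first one, and the virtual directory step is left out.

```python
from urllib.parse import urlsplit


def host_of(domain):
    """Return the host part of a domain given with or without a scheme."""
    return urlsplit(domain if "://" in domain else "//" + domain).netloc


def build_url(path, current_url, root_domains, add_trailing_slash=False):
    current = urlsplit(current_url)
    if add_trailing_slash and not path.endswith("/"):
        path += "/"
    hosts = [host_of(domain) for domain in root_domains]
    if not root_domains or current.netloc in hosts:
        return path  # the current domain matches: a relative Url is enough
    domain = root_domains[0]  # assumption: real domain selection is more involved
    if "://" in domain:
        return domain.rstrip("/") + path  # the domain carries its own scheme
    return f"{current.scheme}://{domain.rstrip('/')}{path}"  # reuse the request's scheme


print(build_url("/our-products/swibble", "https://other.com/", ["example.com"]))
```

Called from a request on example.com itself, the same function returns the relative "/our-products/swibble".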
If the URL provider encounters collisions when generating content URLs, it will always select the first available node and assign the URL to this one. The remaining nodes will be marked as colliding and will not have a URL generated. Fetching the URL of a node with a collision URL will result in an error string including the node ID (#err-1094) since this node does not currently have an active URL. This can happen if an umbracoUrlName property is being used to override the generated URL of a node, or in some cases when having multiple root nodes without hostnames assigned.
This means that publishing an unpublished node with a conflicting URL might change which node is rendered on that specific URL, in cases where the newly published node takes priority according to sort order in the tree!
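The collision behaviour can be illustrated with a short sketch. It is a Python model of the rule described above (first node in tree order keeps the URL, later nodes get an error string), not Umbraco's code. The node ids other than 1094 and the input shape are made up for the example.

```python
def assign_urls(nodes):
    """`nodes` is a list of (node_id, url) pairs in tree sort order."""
    taken = set()
    result = {}
    for node_id, url in nodes:
        if url in taken:
            # A node earlier in the tree already owns this URL.
            result[node_id] = f"#err-{node_id}"
        else:
            taken.add(url)
            result[node_id] = url
    return result


nodes = [(1090, "/about"), (1094, "/about"), (1100, "/contact")]
print(assign_urls(nodes))
```

Reordering the list so that node 1094 comes first would give it the URL and mark node 1090 as colliding, which is the effect of a sort order change in the tree.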
Create a custom Url Provider by implementing the IUrlProvider interface:
The url returned in the 'UrlInfo' object by GetUrl can be completely custom.
If implementing a custom Url Provider, consider the following things:
Cache things.
Be sure to know how to handle schemes (http vs https) and hostnames.
Inbound might require rewriting.
If there is only a small change to the logic around Url generation, then a smart way to create a custom Url Provider is to inherit from the DefaultUrlProvider and override the GetUrl() virtual method.
For example, a provider could add /fish to the end of every url. It's important to note that since this changes the outbound url, but not how urls are handled inbound, it will break the routing. To make the routing work again you have to implement a custom content finder; see IContentFinder for more information on how to do that.
Register the custom UrlProvider with Umbraco:
The GetOtherUrls method is only used in the Umbraco Backoffice to provide a list to editors of other Urls which also map to the node.
For example, let's consider a convention-led umbracoUrlAlias property that enables editors to specify a comma delimited list of alternative urls for the node. It has a corresponding AliasUrlProvider registered in the UrlProviderCollection to display this list to the Editor in the backoffice Info Content app for a node.
Specifies the type of urls that the url provider should produce, e.g. absolute vs. relative Urls. Auto is the default.
These are the different modes:
The default setting can be changed in the Umbraco:CMS:WebRouting section of appsettings.json:
See WebRouting config reference documentation for more information on routing settings.
The ISiteDomainMapper implementation is used in the IUrlProvider and filters a list of DomainAndUri to pick one that best matches the current request.
Create a custom SiteDomainMapper by implementing ISiteDomainMapper.
The MapDomain methods will receive the Current Uri of the request, and custom logic can be implemented to decide upon the preferred domain to use for a site in the context of that request. The SiteDomainMapper's role is to get the current Uri and all eligible domains, and only return one domain which is then used by the UrlProvider to create the Url.
Only a single ISiteDomainMapper can be registered with Umbraco.
Register the custom ISiteDomainMapper with Umbraco using the SetSiteDomainHelper extension method.
Umbraco ships with a default SiteDomainMapper. This has some useful functionality for grouping sets of domains together. With Umbraco Cloud, or another Umbraco development environment scenario, there may be multiple domains set up for a site: 'live', 'staging', 'testing', or a separate domain to access the backoffice. Each domain will be set up as a 'Culture and Hostname' inside Umbraco. By default, editors will see the full list of possible Urls for each of their content items on each domain, which can be confusing. If the additional urls aren't present in Culture and Hostnames, then testing the front-end of the site on a 'staging' url will result in navigation links taking you to the registered domain!
What the editor sees without any SiteDomainMapper, visiting the backoffice url:
Which is 'noise' and can lead to confusion: accidentally clicking the staging url, which is likely to be served from a different environment / different database etc may display the wrong content...
To avoid this problem, use the default SiteDomainMapper's AddSite method to group Urls together.
Since the SiteDomainMapper is registered in the DI, we can't consume it directly from a composer, so first create a component which adds the sites in the initialize method:
Then add the component with a composer:
Now if an editor visits the backoffice via the staging url they will only see domains for the staging url:
Now if an editor visits the backoffice via the backoffice url they will only see domains for the backoffice url and the production url:
NB: it's not a 1-1 mapping, but a grouping. Multiple Urls can be added to a group. Think multilingual production and staging variations, and in the example above, if an editor logged in to the backoffice via the production url, eg umbraco-v8.localtest.me/umbraco - they would see the umbraco-v8-backoffice.localtest.me domain listed.
The SiteDomainMapper contains a 'BindSites' method that enables different site groupings to be bound together:
Visiting the backoffice now via umbraco-v8-backoffice.localtest.me/umbraco would list all the 'backoffice' grouped domains AND all the 'staging' grouped domains.
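The grouping and binding behaviour can be modelled with a small class. This is a Python sketch of the idea behind AddSite and BindSites, not the SiteDomainMapper API. The class, its method names, and the staging hostname are hypothetical.

```python
class SiteDomainGroups:
    """Toy model: hosts are grouped into sites, and sites can be bound together."""

    def __init__(self):
        self._sites = {}  # site key -> set of hosts in that group
        self._bound = {}  # site key -> set of site keys bound to it

    def add_site(self, key, *hosts):
        self._sites[key] = set(hosts)

    def bind_sites(self, *keys):
        for key in keys:
            self._bound.setdefault(key, set()).update(k for k in keys if k != key)

    def visible_hosts(self, current_host, candidates):
        own = [k for k, hosts in self._sites.items() if current_host in hosts]
        if not own:
            return list(candidates)  # unknown host: nothing is filtered
        keys = set(own)
        for key in own:
            keys |= self._bound.get(key, set())
        allowed = set().union(*(self._sites[k] for k in keys))
        return [host for host in candidates if host in allowed]


groups = SiteDomainGroups()
groups.add_site("backoffice", "umbraco-v8-backoffice.localtest.me", "umbraco-v8.localtest.me")
groups.add_site("staging", "umbraco-v8-staging.localtest.me")  # hypothetical staging host
all_hosts = ["umbraco-v8.localtest.me", "umbraco-v8-backoffice.localtest.me",
             "umbraco-v8-staging.localtest.me"]
print(groups.visible_hosts("umbraco-v8-staging.localtest.me", all_hosts))
groups.bind_sites("backoffice", "staging")
print(groups.visible_hosts("umbraco-v8-backoffice.localtest.me", all_hosts))
```

Before binding, a visitor on the staging host sees only the staging domain. After binding, a visitor on the backoffice host sees the domains of both groups.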
Node | Segment | Internal Path |
---|---|---|
Our Values | our-values | /our-values |
Our Products | our-products | /our-products |
Swibble | swibble-123xyz | /our-products/swibble-123xyz |
Dibble | dibble-456abc | /our-products/dibble-456abc |
Another Site | another-site | 9676/ |
Their Values | their-values | 9676/their-values |