Search This Blog

Monday, November 18, 2013

Manage lists and libraries with many items

Overview of lists and libraries with many items
SharePoint 2010 supports very large lists and libraries. The following table summarizes the specifications and limits for lists and libraries that contain many items.
Number of items in a list or library
50 million
Size of an individual file item or item attachment
2 Gigabytes
As you can see, you can store a lot of data and files in lists and libraries. But if a query or database operation exceeds a resource throttle or limit, you may be blocked from doing this operation because accessing that many items could adversely affect other users of the site. Fortunately, there are a set of straightforward strategies and basic features you can use to achieve the results you want to prevent you from exceeding these throttles and limits. All users can circumvent the limit during a dedicated time window, and with appropriate permission you can also use a higher limit through code.
Why is 5,000 such a magic number?
To minimize database contention, SQL Server often uses row-level locking as a strategy to ensure accurate updates without adversely impacting other users who are accessing other rows. However, if a read or write database operation, such as a query, causes more than 5,000 rows to be locked at once, then it's more efficient for SQL Server to temporarily escalate the lock to the entire table until the database operation is completed. Note that the actual number is not always 5,000, and can vary depending on your site, the amount of activity in the database, and your site's configuration. When this lock escalation occurs, it prevents other users from accessing the table. If this happens too often, then users will experience a degradation of system performance. Therefore, thresholds and limits are essential to help minimize the impact of resource-intensive database operations and balance the needs of all users.
What happens behind the scenes when you access many items in a list or library?
The following diagram summarizes the key points about what happens behind the scenes when you access many items in a list or library.

Callout 1 Information workers can use appropriate views, styles, and page limits to speed up the display of data on the page.
Callout 2 Filtered views with column indexes (and other operations) create database queries that identify a subset of columns and rows and return this subset to your computer.
Callout 3 Farm administrators can specify dedicated time windows for all users to do unlimited operations during off-peak hours.
Callout 4 Privileged developers can use object model overrides to temporarily increase thresholds and limits for custom applications.
Callout 5 Thresholds and limits help throttle operations and balance resources for many simultaneous users.
Callout 6 List or library data in a site collection is stored in a SQL Server database table, which uses queries, indexes and locks to maintain overall performance, sharing, and accuracy.

Resource throttles and limits
The following table summarizes information about resource throttles and limits that you need to be aware of. These throttles and limits are set on the Resource Throttling page in Central Administration. Contact your administrator for specific limits and requests for your site.
 NOTE   To assist central administration, the computer administrator of the front-end computer and the central administrator site owner accessing a list or library with many items are not subject to the following resource throttles and limits.
List View Threshold
Specifies the maximum number of list or library items that a database operation, such as a query, can process at one time. Operations that exceed this limit are blocked.
To give you time to make alternative plans, SharePoint 2010 warns you on the List Settings page when your list has exceeded 3,000 items. The warning contains a help link to this topic.
Unique permissions limit
Specifies the maximum number of unique permissions allowed for a list or library.
Every time you break the inheritance of permissions for an item or folder, it is counted as 1 unique permission toward this limit. If you try to add an item that would lead to exceeding this limit, you are prevented from doing so.
Row size limit
Specifies the maximum number of table rows internal to the database used for a list or library item. To accommodate wide lists with many columns, each item is wrapped over several internal table rows, up to 6 rows and up to a total of 8,000 bytes (excluding attachments).
For example, if you have a list with many small columns, one that contains hundreds of Yes/No columns, then you could exceed this limit, in which case you would not be able to add more Yes/No columns to the list, but you still may be allowed to add columns of a different type.
 NOTE   Farm administrators can only set this limit by using the object model, not through the user interface.
List View Lookup Threshold
Specifies the maximum number of join operations, such as those based on lookup, Person/Group, or workflow status columns.
If the query uses more than eight columns, the operation is blocked. However, it is possible to programmatically select which columns to use by using maximal view, which can be set through the object model.
List View Threshold size for auditors and administrators
Specifies the maximum number of list or library items that a database operation, such as a query, can process at one time when performed by an auditor or administrator with appropriate permissions. This setting works in conjunction with Allow Object Model Override.
Allow Object Model Override
Specifies whether or not developers can perform database operations, such as queries, that request an override of the List View Threshold to the higher limit specified by the List View Threshold for auditors and administrators. A farm administrator must enable an object model override, and then developers with appropriate permission may then programmatically request that their query use the higher List View Threshold to take advantage of it.
Daily time window
Specifies a time period during which resource thresholds and limits are ignored. A farm administrator can configure a time window during "off-peak" hours in 15 minute increments and up to 24 hours, such as, 6:00 PM to 10:00 PM or 1:30 AM to 5:15 AM.
 NOTE   A database operation or query started within the daily time window continues until completion (or failure) even if it doesn't finish within the specified time window.
Ways to manage lists and libraries with many items
The following sections provide recommendations, techniques, and tips for ensuring that you and other users can quickly access many items in a list or library and help to keep your site operating smoothly.
Creating SharePoint indexed columns
To help improve the performance of a large list or library, you can index up to 20 columns. In general, an index on a column enables you to quickly find the rows you want based on the values in that column, even when working with millions of items. When you combine indexes with filtered views, you can quickly retrieve the items you want.
It is important to consider the following when you create and use indexed columns. Each additional column index consumes extra resources in the database and adds some overhead to every operation to maintain the index. Therefore, you should add indexes only to columns that will be used actively for filtering in views on the list or library. It's a good idea to look at all the views and see which columns are used most commonly across different views for filtering to help choose the right columns to be indexed. Note that any column that you define to have a unique value requires an index.
The following table summarizes what columns can and cannot be indexed.
Single line of text
Multiple lines of text
Choice (single value)
Choice (multi-valued)
Hyperlink or Picture
Date and Time
Custom Columns
Lookup (single value)
 NOTE   Indexing the column in the other list or library does not improve performance of the lookup operation.
Lookup (multi-valued)
Person or Group (single value)
Person or Group (multi-valued)
External data
Managed Metadata
For steps on how to create an indexed column, see the section Create a simple or compound index in this topic.
Creating filtered views based on column indexes
For a view to quickly filter through a large number of items, the first column that you specify in the filter must be indexed. Other columns you specify in the view filter may or may not be indexed, but the view does not use those indexes. You also need to make sure that the first column of the filter does not return more items than the List View Threshold, even if the final result of the filtered view returns less than the List View Threshold. If the first column of the filter returns more items than the List View Threshold, you can use a filter with two or more columns When you define a filtered view that uses two or more columns, using an AND operator will usually limit the total number of items returned. But even in this case, you still need to specify as the first column in the filter the column that most likely returns the lesser amount of data. Note that using an OR filter almost always increases the number of items returned and won't be effective under these circumstances.
 IMPORTANT    If you are filtering a list by an indexed field, check your recycle bin for deleted items. Items in the recycle bin aren't actually deleted from the backend database until they are removed from the recycle bin. If the total number of filtered items in the list and in the recycle bin is greater than the List Threshold limit, then you may get an incomplete set of results, or none at all.
 NOTE   If you are a developer, you can programmatically use a compound index on the first column. A compound index is based on two columns, primary and secondary, and can use either column to access the data quickly.
The following are some suggestions for typical views that would work well with indexed columns:
Recently changed items
Modified column
To view only items that have changed in the past week, apply the filterModified is greater than  [Today]-7
New items
Created column
To create a view of only those items that were added in the past week, apply the filter Created is greater than [Today]-7
My items
Created Bycolumn
To create a view of only those items that you added, apply the filter Created By is equal to [Me]
Items due today
Due Datecolumn (created by you in a list or library)
To create a view of only those items with a due date of today, apply the filterDue Date is equal to [Today]
Discussion board updates
Last Updatedcolumn
To create a view of only the discussions that were updated in the past month, apply the filter Last Updated is greater than [Today]-30.
Archive files in a document library
Date Modified
To create a view of documents not changed since the year 2006 (that you then want to archive), apply the filter Date Modified is less than December 31st 2006
Find a subset of financial data
RegionYear (as two simple indexes)
To create a view of financial data for the NorthEast region in 2008, apply the filter Region equals "NE" AND Year equals 2008 because the Region column is likely to have less values than the Year column
Find links to more information about creating or changing views and about formulas and functions that you can use to filter views in the See Also section.
 NOTE   Even when you create a filtered view based on column indexes, certain additional operations might still be blocked because they require accessing the entire list or library. These operations include the following: adding or deleting an index, creating a sort in the view definition; displaying a column total; and adding, updating, or deleting calculated fields.
Organizing items into folders
Although folders (also called containers) are not required to use large lists and libraries, you can still use them to help organize your data and improve the efficiency of your data access. When you create a folder, behind the scenes you are creating an internal index. This internal index is also created for the root folder, or top-level of a list or library. When you access items in a folder, you are effectively using this internal index to access the data. Bear in mind that if a folder contains subfolders, each subfolder is counted as an item (but not any items in that subfolder).
Even when the total number of items in a list or library is very large, a view of a single folder is at least as fast as a view that filters the total number of items by using an indexed column. In some scenarios, it may be possible to distribute all of the items in a list or library into multiple folders such that no folder has more than five thousand items.
It is important to consider the following when you use folders to organize a large list or library:
·         A folder can contain more items than the List View Threshold, but to avoid being blocked by SharePoint 2010, you may still need to use a filtered view based on column indexes.
·         If you choose the Show all items without folders option in the Folders section when you create or modify a view in this list or library, you must then use a filter that is based on an index to ensure you don't reach theList View Threshold.
·         It is often useful to make the default view show all the available folders without any filtering so that users can choose the appropriate folder when they insert new items. Displaying all the folders also makes it less likely that items will be incorrectly added outside the folders in the list or library. Note that, unlike libraries, there is no automatic way to move items between folders in a list.
Controlling the display of data on the page
Once the subset of data has been transferred to your computer, there are additional strategies you can use to speed up the display of the data in your browser.
As you increase the size of the HTML source code behind a web page, there is usually a corresponding increase in time to render the page. The larger the page, the longer it takes to display. For example, the default style for a list or library view is the table style, which displays all the data, one item per table row, on one page. Therefore the more rows you display, the longer it takes to display the data.
Setting an item limit on a page
By setting an item limit on a page, you can directly control how many items display at one time by using a specific number, the default being 30.
There are two options you can choose when setting an item limit on a page:
·         If you select Display items in batches of the specified size (for paging), then you are creating a page-by-page display of the data which is useful when you are browsing items in a non-interactive way.
·         If you select Limit the total number of items returned to the specified amount, then you are creating a hard limit that may or may not return the entire results of your filter operation, but may be useful when testing your view, creating a prototype, or you only want to retrieve the top values in a view.
Using the Preview Pane view style
To see all of the information from items in your list in a vertical pane format, use the Preview Pane style. You can more quickly navigate the data by hovering over the item title in a scrolling region on the left side of the page, to see all the column values of the current item vertically listed on the right side of the page. Less initial data is displayed, and this helps display the view faster. This style is also very useful when your list is wide or has many columns and would require horizontal scrolling to see the data.
Using the Group By view style
The Group By view style groups the data by a category (up to two levels) that you choose when defining the view, which makes the initial display of the list view much smaller. You click a Plus/Minus button to expand/collapse a specific group of data and only load the subset of data that is needed onto the current page. This view style not only limits the display of the data, but also the retrieval of the data. Make sure that the default Show Grouping option is set to Collapsed. You can also control the paging of the grouped data in the same way as data that is not grouped. In some situations, such as simple reporting or drilling down and drilling up to find and update data, this view style can be much more efficient for users to use.
Using Datasheet view
Datasheet view is based on an ActiveX control and specifically designed to load data faster by retrieving and updating data asynchronously. This view makes it much more efficient to scroll, edit, and work with a lot of data, and is especially useful for doing frequent adds, updates, and comparisons to your data. Without disrupting your work, your changes are sent to the server, status is communicated to you by using icons, your changes are efficiently synchronized to the server and you can quickly resolve occasional conflicts. Datasheet view respects view filter, sort, and folder settings, but is not designed to work with page limits, dynamic filters and sorts in Standard view, or other non-tabular view styles, such as Group By and Preview Pane.

 TIP   If your list is organized by folders and you would like to see all the data in Datasheet view without the folders (a "flattened" view), you can set the view to see all items without folders. However, you may need to use a filter that is based on indexed columns to avoid reaching the List View Threshold.
Using the daily time window
The daily time window is a way for farm administrators to specify a dedicated time period for all users to do unlimited operations during off-peak hours, such as 6:00 PM to 10:00 PM.
Although views are a primary way to retrieve items from a list or library, other SharePoint commands and operations also require accessing the items in a list or library, such as: adding and deleting an index, adding and modifying a list column, deleting and copying folders, changing security settings for a list or library, saving a list with its content as a template, deleting a web or site, and restoring or deleting items from the Recycle Bin. These commands and operation could also fail if they exceed the site thresholds and limits when retrieving list data.
If you are not a developer or don't have convenient access to developer resources to do these operations for you, you can defer these types of operations to the daily time window. If the regular daily time window period is set up during normal sleeping hours, you can ask the farm administrator to move the time period to an early morning or late evening hour.

No comments:

Post a Comment