How we found out why products suddenly disappear from the category frontend in Magento 2 and that prices randomly is shown as 0 (zero/free), also on the category pages
We experienced this issue related to several M2 launches for some of our larger clients very recently (Q2 2019), which happened to have a fair amount of configurable products (15–25k configurable, with 120–150k simple attached).
All affected installations ran the very latest, bright and shiny Magento Commerce 2.3.1 (the issue is confirmed in both Magento Open Source 2.3.1 and Magento Commerce Edition 2.3.1).
And yes, sure, the issue has indeed been reported as a support ticket to Magento with severity High.
Who are you guys?
We are some of the absolutely dinosaurs of Magento, having over the past 10+ years launched some 400+ M1 stores and some 40+ M2 stores, making us a heavy-weight Magento champion, with thousands of Magento issues won by knockout; some in round one, some in the final round 12.
This article was originally written by one of our senior developers; Kurt Inge Smådal.
Ok, so what is this price index issue all about?
So, Magento 2 has tons of indexes, which is a good thing, as they try to keep the site both perform- and respond faster.
All of the indexes are frequently updated. Some instantly. Some on a schedule.
All still good.
One of these indexes, the price index, is one of those who runs on schedule, and which — generally — only updates the products that has changes to them.
But, from time to time, this price indexing process will also do a full reindex, which is where it starts to get a bit tricky.
For a product to show in the frontend (e.g in the category list) there simply needs to be a valid entry for that product in the price index, which makes it absolutely crucial that the price index is up to date at any given time.
Furthermore, Magento will simply remove the product from the price index if it for some reason is Disabled or set to Out of Stock (this only applies if you have set up Magento to not show Out of Stock-products).
So, the issue with this is two-fold;
- It takes hours to reindex for larger cataloges with some tens of thousands of products (which aren't that many products), which “occupies” the index and also adds unnecessary load to the server(s)
- Much worse, products disappear from categories during the reindex process, and, products might be shown as 0 / zero priced products during a short period of time as the price index is running
Understanding the issue
To understand this issue we had to — like we often have to — “reverse engineer” how things in Magento are architectured and built, so given the nature of this issue and its severity, we thought it was worth a Medium article with some additional in-depth on it all.
So if you are up to learn more about the indexes in Magento, and particular the price index, and to — yay — get the solution to a major issue, fill your cup of coffee and continue reading.
A deep-dive into the price index process
The price index process is running for each and every product type;
simple, virtual, downloadable, configurable, bundle and grouped (and 'giftcard' if you are up on Magento Commerce).
The product types are defined in the file product_types.xml
like this:
<type name="simple" label="Simple Product" modelInstance="Magento\Catalog\Model\Product\Type\Simple" indexPriority="10" sortOrder="10">
<indexerModel instance="Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\SimpleProductPrice" />
<customAttributes>
<attribute name="refundable" value="true"/>
</customAttributes>
</type>
As you can see, the indexerModel and the indexPriority is defined in this file.
Lets take a closer look at them.
First, the indexPriority is defining the order of when the product type is processed, which in a standard Magento 2 installation is;
- downloaded
- simple
- virtual
- configurable
- bundle
- grouped
One thing to notice is that the non-composite product types is processed first, and then the composite ones.
This means that even if you set bundle products (which is a composite product type, as it contains a “grouped set” of products) as indexPriority 1, it will nevertheless not run before simple, virtual and downloadable products (in Magento Commerce there is also giftcard which is processed right after virtual).
This is due to how the composite products calculate their prices — by using the catalog product index to find the correct prices for the simple products attached to the composite product.
The indexing process for the non-composite product index
The price index process for non composite products (simple, virtual, downloadable) is luckily rather straight forward; The process will create a temporary table (catalog_product_price_index_temp
) to which it inserts all the relevant products.
The process will then go through this table and modify the prices. This is done by calling all the classes that is defined as price modifiers on the Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\PriceInterface
.
The price modifies has to implement the PriceModifierInterface
.
An example of a price modifier definition is from catalog inventory:
<type name="Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\PriceInterface">
<arguments>
<argument name="priceModifiers" xsi:type="array">
<item name="inventoryProductPriceIndexFilter" xsi:type="object">Magento\CatalogInventory\Model\Indexer\ProductPriceIndexFilter</item>
</argument>
</arguments>
</type>
All the price modifiers will be running on the products in the catalog_product_price_index_temp
table.
When this is done the catalog_product_price_index_replica
table is updated from the catalog_product_price_index_temp
table.
The “replica” tables are used during reindex to avoid locking issues.
This price modifier select all the products that is Not in Stock and deletes then from the catalog_product_price_index_temp
table.
This is done in batches, defined in the di.xml
file, and the default here is 200.
The indexing process for composite product index
The price index process for composite products on the other hand (configurable, bundle, grouped) is as you can imagine a bit more complex.
This process will also create a temporary table (catalog_product_price_index_temp
) where it insert all the products into.
It will then run the same price modifiers that the non-composite product index is running (removing all Not in Stock products).
Next, it will create a second temporary table where it selects from the catalog_product_price_index_replica
table and inserts
all the simple products that the composite product is build up by and then uses this table to find the max and min price of the composite product.
When this is done the catalog_product_price_index_replica
table is updated from the catalog_product_price_index_temp
table.
Batching to avoid large memory consumption
All the indexing is done through batching to avoid large memory consumption.
The batch size for each product type is defined in the di.xml
file, with the default batch size of 5000 items pr batch.
<type name="Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\BatchSizeCalculator">
<arguments>
<argument name="batchRowsCount" xsi:type="array">
<item name="default" xsi:type="number">5000</item>
</argument>
<argument name="estimators" xsi:type="array">
<item name="default" xsi:type="object">Magento\Catalog\Model\Indexer\Price\BatchSizeManagement</item>
</argument>
</arguments>
</type>
There is also a rowsize estimator that calculate the memory size for all rows pr product in the temp tables and set the temp table size for the current session accordingly.
This function will give a warning in the log files if the batches uses more than 20% of innodb_buffer_size.
For non-composite products, the number set in batchRowsCount is the accurate count used when selecting products from the catalog_product_entity
table.
For composite products there is batchSizeAdjusters
that will calculate the batch size.
<argument name="batchSizeAdjusters" xsi:type="array">
<item name="configurable" xsi:type="object">Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\CompositeProductBatchSizeAdjuster</item>
</argument>
Now it starts to get a little exciting.
What this function does is that it will find the product with most product relations, and then divide the given batchRowsCount by this number.
Example: If you have a configurable product that has 100 product relations (variants), it will divide the batchRowsCount by 100 for all composite product types (or all product type that uses the Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\CompositeProductBatchSizeAdjuster
batch size adjuster).
For some most likely (at least hopefully) logical reason, Magento has default'ed this to 5000 for all product types except bundle products, which is set to… 136..?
This value will however as mentioned vary between configurable, bundle and grouped, and you can customize it yourself by editing your di.xml
file:
<type name="Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\BatchSizeCalculator">
<arguments>
<argument name="batchRowsCount" xsi:type="array">
<item name="simple" xsi:type="number">1000</item>
<item name="configurable" xsi:type="number">100000</item>
<item name="bundle" xsi:type="number">100000</item>
</argument>
</arguments>
</type>
It is not always a good idea to simply increase the batch sizes to a high number, as it in some cases can cause a lower performance.
In our experience, actually lowering the batchRowsCount for simple products in many cases increased the performance.
The batchSize
There is also another batchSize setting that is important to be aware of; the ProductPriceIndexFilter
batchSize.
The batchSize will determine how many products should be queried from the stock table when it is removing sold out products from the catalog_product_price_index_temp
table.
An example for this would be that if you have 2000 sold out products in your stock table, and you have a batch size of 200, then there will be 10 delete queries in on the catalog_product_price_index_temp
table for each batch inserted into it.
Increasing the batchSize for
ProductPriceIndexFilter
seem to increase performance if you have a large catalog with many sold out products.
Trying to make some conclusions
Even if you only have none-composite products, the price reindex can take a lot of time if you have a large product catalog.
The product price index filter is running on all batches — even though the catalog_product_price_index_temp
is empty and there are Sold Out products to delete.
Real-world example:
A customer with around 110K simple products and 7.5K configurable products took around 2 hours 30 minutes to do a full price reindex.
Worse, even though there where none bundle nor grouped products, we noticed that most of the time for reindexing went to the indexing of these product types.
W-w-why? Because the bundle- and grouped product indexer only ran batches of 16 products on each run, and for every run — even though it didn’t insert any bundle products into the catalog_product_price_index_temp
table — it would still run the ProductPriceIndexFilter
, which would on its own run batches of 200 sold out products which it tried to delete from the empty catalog_product_price_index_temp
table. Gah!
So, with around 120K products in the catalog, this would be (120K/16) 7500 queries that tried to fetch products.
And with around 85K old products in the stock table that is marked as Sold Out there would also be for each of these 7500 queries another set of (85K/200) 425 queries that tries to delete Sold Out products from
the empty catalog_product_price_index_temp
table. Gah²!
And this would then be repeated twice (once for bundle and once for grouped). Gah³!
So, the grand-finale solution?
After adjusting the batchRowsCount for the BatchSizeCalculator
and batchSize for the ProductPriceIndexFilter
, the run time went down significantly. From around 2 hours 30 minutes to around 15 minutes.
Any lesson learned?
Never trust the default settings.