Mar 17, 2009

TYPO3 extension update: Google sitemap

The new version of the Google sitemap extension has just appeared in TER. The extension was received very enthusiastically by the community. This version fixes some issues reported by users. In particular, these issues were fixed:
  • if root page is a shortcut, sitemap will be truncated to a small amount of pages
  • tt_news sitemap is empty if home page is a shortcut or if the news sysfolder is outside of the web site root
  • tt_news articles are not indexed by Google
The first two points are simple. I describe how to avoid the shortcut problem right below. The last problem needs a deeper explanation, which I provide closer to the end of this article.

Avoiding the shortcut problem

First, it should be clear that a shortcut home page is not a problem that dd_googlesitemap can solve on its own. When a request comes in, TYPO3 checks whether the page is a shortcut. If it is, TYPO3 automatically resolves the shortcut to its destination page. Thus the Google sitemap runs on the shortcut's target page and lists only the subpages of that page.
So how do you show all pages of the site if the home page is a shortcut?
It is quite simple. There is a new option named tx_ddgooglesitemap.forceStartPid. Set it to the root page id of the site. That's it. The problem is solved.
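As a minimal sketch, assuming the option is read from the TypoScript setup of the site's root template, the setting could look like the fragment below. The page id 1 is only a placeholder; replace it with the uid of your own root page.

```typoscript
# Assumption: the site's root page has uid 1; use your own root page uid here.
tx_ddgooglesitemap.forceStartPid = 1
```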
Now let's get to the tt_news story.

Sitemap types and Google

Google supports two types of sitemap:
  • "normal" sitemaps (yes, this is what they are really called!)
  • "news" sitemaps.
Normal sitemaps are what most people use. They work for the regular Google indexer.
News sitemaps are for news. A little-known fact is that the regular Google indexer does not take news sitemaps into account at all. News sitemaps are for Google News only!
The image below shows URL statistics for the news sitemap I submitted to Google. The 'Indexed URLs in Sitemap' field shows zero. This made me suspicious and I did a little digging on the Internet. It appears that tt_news items should be presented using a "normal" sitemap.

By default dd_googlesitemap will now show tt_news items using a "normal" sitemap. If the news should also be available through Google News, it is possible to submit them again using the same sitemap URL with &type=news appended. This forces dd_googlesitemap to generate a "news" sitemap.
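For illustration only, assuming the sitemap is served at the eID URL that one commenter quotes below (example.com is a placeholder domain), the two URLs to submit might look like this:

```text
http://example.com/?eID=dd_googlesitemap            ("normal" sitemap, the default)
http://example.com/?eID=dd_googlesitemap&type=news  ("news" sitemap)
```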
Questions? Please post them in the comments.

31 comments:

  1. > "if root page is a shortcut, sitemap will be truncated to a small amount of pages"

    *sigh* I thought I was too dumb to set up a "no config" extension...

    Now I set the tx_ddgooglesitemap.forceStartPid to the last shortcut page which has the pages as children and it seems to work.

  2. Thanks for #2862! Exactly what I was missing.

  3. Thanks for the update!

  4. Hi,

    Thanks for fixing the shortcut issue!

    Can you also make it honor 'not in menu' and 'no search'? That still does not seem to work for me.

    Cheers!

  5. 'No search' should be excluded by now. I implemented it yesterday.

    I am not sure about 'Not in menu'. This has nothing to do with indexing. 'No search' pages should not be indexed but 'not in menu' just does not appear in the TYPO3 menu. They are still searchable by Google.

  6. Alessandro Mazzocco, March 18, 2009 at 2:04 PM

    Hi,

    I use the new version of the plug-in but I have the same problem as with the previous version.

    I can't generate the sitemap because the XML site file has only the home page and the XML news file has nothing.

    Do you have suggestions?

    Thanks

  7. Alessandro Mazzocco, March 18, 2009 at 6:20 PM

    Hi,

    problem solved, great.

    The problem was caused by the cache.

    Now everything works.

    Thanks

  8. Alessandro Mazzocco, March 19, 2009 at 9:49 PM

    Hi,

    I tested the plug-in and I found a problem.

    In the log page I found an endless series of records like this:

    Time User Type E Action Details

    19:44 admin@LIVE DB Sys! Update MySQL error: 'Unknown column 'tx_ddgooglesitemap_lastmod' in 'field list'' (pages:335) (msg#1.2.12)

    I think there is a loop because the messages never stop.

    Regards

  9. Around line 139 of class.tx_ddgooglesitemap_pages.php the code checks for $pageInfo['doktype'] and $pageInfo['no_search'], but the only values present in $pageInfo are:

    [uid]
    [SYS_LASTCHANGED]
    [tx_ddgooglesitemap_lastmod]

    So even though I set a page to no_search, it still ends up in the sitemap.

    Cheers

  10. Hi Dmitry,

    Thanks for fixing the shortcut issue. For me it works like a charm.

    Cheers!

  11. Michiel, are you sure you use latest version? ;)

  12. Ah, the latest version is much better thank you!

  13. Hi,

    I'm using the latest version from TER (1.0.3).

    It solves the shortcut problem for the root page using tx_ddgooglesitemap.forceStartPid.

    I have a problem with all pages of type shortcut in the page tree: they are not inserted in the XML sitemap at all. Any idea?

    Thanks

  14. Rino, this is correct. Shortcut pages are not inserted into the sitemap. Their destination pages are inserted. Inserting a shortcut page only makes the sitemap larger. It lowers the chances that Google will index real pages. So it is good not to have shortcut pages in the sitemap.

  15. Hi Dmitry,

    I have to support Rino a little bit, since I'm having the same issue. For example:

    Root (shortcut, indexed)
    - Home (indexed)
    - Contact (indexed)
    - Projects (shortcut, not indexed)
    -- Category A (not indexed)
    --- Project A (not indexed)
    --- Project B (not indexed)
    -- Category B (not indexed)
    ..... etc
    - Disclaimer (indexed)

    Hopefully there's a solution nearby.

    Thanks in advance. Robert

  16. Thanks for the update!

    How can I change http://example.com/?eID=dd_googlesitemap to http://example.com/sitemap.xml?

    Anderl

  17. Dear Dmitry

    I experience the same as Robert: subpages of shortcut pages are not in the XML output.

    I agree with Michiel Roos that 'not in menu' pages should not be listed in the sitemap. We omit these pages from the menu on purpose and often do not want direct access to them. If we do add links to those pages from anywhere in the site, spiders will likely find them. So the upsides of omitting 'not in menu' from sitemaps outweigh the downsides.

  18. Hi Dmitry, I understand that the tx_ddgooglesitemap.forceStartPid solves the problem for the page that is a short-cut and its sub-pages. But what if there are multiple pages that are short-cuts? Do you need to define the forceStartPid for each of those? If you forget one, the subpages will not show up in the sitemap. Would it not be better to just search the whole pagetree automatically, include all normal pages in the xml tree and just ignore all shortcuts? seo_basics does this correctly. Otherwise, great extension, thanks!

  19. Dear Mr. Dulepov,

    Is it me or is it the extension? It does not work on the 4.3 I have...

    It has something to do with: /cache/ instead of /cache/backend/

    with regards,

    Piet Roorda

  20. I can't get it to work with 4.3.0:

    Fatal error: require_once() [function.require]: Failed opening required 'D:/www/t3dev/t3lib/cache/class.t3lib_cache_abstractbackend.php' (include_path='.;C:\php5\pear') in D:\www\t3dev\typo3conf\ext\dd_googlesitemap\class.tx_ddgooglesitemap_eid.php on line 132

  21. Thanks for all your brilliant work, Realurl rocks!! I have 4.3 and seobasics, and I get the error message "The page is not configured! [type= 776][]" with my smartphone site. Thanks again for Realurl, it's exobrill.

  22. Joost van Berckel, April 2, 2010 at 10:38 PM

    Thanks Dmitry,

    It works for me!

    Thx

    Joost van Berckel

  23. I have a site with a shortcut and set forceStartPid to the first "real" page, but it doesn't work. All I get is the first page in the sitemap. Any hints?

  24. With the "TYPO3 Introduction Package" version 4.4.4 I have only one entry in sitemap.xml

    http://bibliogum.kodingen.com/typo3/index.php?eID=dd_googlesitemap

    Is there a way to fix it?

    Thanks

  25. Hello Dmitry,

    my sites are:

    root-page id=0
    Shortcut id=1 (tx_ddgooglesitemap.forceStartPid = 1)
    Sys-Folder id=2
    Site1
    Site2
    Site3
    ...

    The Sitemap shows only ""

    If I change tx_ddgooglesitemap.forceStartPid to 2, the sitemap shows an entry ONLY for "Site1".

    What's wrong? Thank you, adino

  26. Hi! Great extension, tnx, worx fine (with one site-language ;)

    What should I do when I have multiple languages in one tree?
    domain/bla.html is native language
    domain/en/blu.html is second language

    the ext only produces a sitemap for the native language...

  27. After updating the dd_googlesitemap extension to version 1.0.6 (TYPO3 4.5.19), some of the pages listed in the menu, and their subpages, are not shown in dd_googlesitemap. However, these pages are shown by an online sitemap generator.
    Is there a way to have them listed in dd_googlesitemap?

  28. Hi Dmitry,

    is there a solution to generate only translated page URLs within a multilanguage setup?
    I am using the &L= parameter for my 8 languages, but only L0 and L1 are 100% filled. The pages are only partially translated in the other languages, but the extension takes the page name from L0 and builds the URL with the L3 switch, like "domain.com/fr/kontakt" instead of "domain.com/fr/contactez"

    Replies
    1. I'm also interested in that specific request.

    2. Update: Did it on my own. Just add a check for non-translated pages, querying pages_language_overlay for an entry with the page's uid/pid and your L parameter. If you don't want "hide in menus" pages listed, just add $GLOBALS['TSFE']->pages['nav_hide'] to the pageInfo array and check for that too.

      So long
      Gion