Feb 27, 2009

RealURL: separate language domains in an easy way

TYPO3 fully supports multilanguage web sites. Everyone knows the L URL parameter. But what if you want a separate domain for each language while keeping a single page tree? Until recently this was not easy and required a certain amount of TYPO3 black magic. In this article I will describe how to do it in 5 minutes (or less!) and give you a bonus: a better site position in Google.
Typically, when a user switches to another language, the URL changes to contain the L variable ('&L=1'). With RealURL it looks better: the URL becomes http://example.com/en/. This looks nice. So far, so good.
The problem is that Google and other search engines prefer domains from the user's domain zone. If relevancy is the same, Google will prefer an "es" domain to a "com" domain for Spain and a "de" domain for Germany. For better search engine optimization it becomes essential to have domains in the corresponding countries. Registering domains is easy, but how do you make TYPO3 use them and keep the single page tree?
Solving this task means removing the language segment from the URL and converting http://example.com/de/mypage to http://example.de/mypage/.
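To make the conversion concrete, here is a minimal Python sketch of the URL mapping only. It is not RealURL configuration and not how RealURL does it internally; the `LANGUAGE_DOMAINS` table and the function name are assumptions made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from language segment to language domain.
LANGUAGE_DOMAINS = {"de": "example.de", "es": "example.es"}


def to_language_domain(url):
    """Move a leading language segment out of the path and into the host."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments or segments[0] not in LANGUAGE_DOMAINS:
        return url  # no known language segment: leave the URL alone
    host = LANGUAGE_DOMAINS[segments[0]]
    # Rebuild the path without the language segment, with a trailing slash.
    path = "/" + "/".join(segments[1:])
    if not path.endswith("/"):
        path += "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


print(to_language_domain("http://example.com/de/mypage"))
```

URLs without a known language segment are returned unchanged, so the default language stays on the original domain.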
It is possible to have language domains with TYPO3 after doing certain non-trivial TypoScript magic. Alternatively, it is possible with some coding. However, there is a much better, faster and simpler solution. It is brought to you by RealURL (surprise, surprise!).

Feb 26, 2009

Set SVN properties for TYPO3 extensions

TYPO3 core uses certain coding conventions. One part of these conventions is Unix line endings and file information block form

Unix line endings

Unix line endings are standard in TYPO3. It is easy to enforce them with SVN. SVN has a property named 'svn:eol-style', which defines what line endings a file gets during checkout. For TYPO3 core and extensions this property should be set to LF. Here is how to do it from the command line (shell):
svn ps svn:eol-style LF class.tx_myext_whatever.php
All text files inside the extension should have this property.
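Before setting the property on every file, it can help to see which files still contain non-Unix line endings. The following Python sketch is one way to check. The list of extensions treated as text and both function names are assumptions for illustration, not part of TYPO3 or SVN.

```python
import os

# Assumption: these extensions cover the text files of a typical extension.
TEXT_EXTENSIONS = {".php", ".txt", ".xml", ".html", ".css", ".js", ".sql"}


def has_unix_line_endings(data):
    """Return True if the bytes contain no CR characters (LF-only endings)."""
    return b"\r" not in data


def find_non_lf_files(root):
    """List text files under root whose line endings are not pure LF."""
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion's own metadata directories.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in TEXT_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if not has_unix_line_endings(handle.read()):
                    offenders.append(path)
    return sorted(offenders)
```

Calling `find_non_lf_files("path/to/extension")` returns the files that need their line endings fixed.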

Feb 25, 2009

A grave mistake with passwords

One grave security-related mistake that I saw twice this month was related to passwords.
You are aware of the TYPO3 vulnerability that allows attackers to read the localconf.php file. Lots of sites were attacked and hackers got the passwords to some of them.

Camino works faster than Firefox on Mac

Camino is a Gecko-based browser for Mac. It is a little brother (sister?) of Firefox. But it works at least two times faster than Firefox 3. Why is that? No idea. In addition, it takes less memory and does not seem to leak it. Fonts look better in Camino.

I decided to use Camino after Firefox hung once again and destroyed all my tabs (despite the session saver extension).

I found only two problems with Camino.

First, it does not import Firefox bookmarks correctly. It looks only in the "default" profile ("default.profile" folder). I have Firefox 2 with that profile and Firefox 3 with another profile. Camino does not ask me which profile to import and takes the Firefox 2 bookmarks.

Secondly, it is not possible to install Firefox extensions in Camino. On the other hand, this can be good: the fewer extensions, the faster it runs.

1Password is supported for Camino too, so I will not lose any saved login.

I also heard that Opera works fast and well on Mac but I could not get used to it on PC and I do not think anything will change for Mac. So my default browser for now is Camino.

Feb 24, 2009

Another Google sitemap TYPO3 extension

This morning I submitted a Google sitemap extension to the TYPO3 extension repository. I have had it running for some time already on this and some other sites of mine.
Sitemaps help Google to index your site better and faster. Having them on TYPO3 web sites brings only advantages.
There are several extensions that implement such functionality. Why yet another extension? Here is why:
  • it works correctly with config.baseURL, config.absRefPrefix and domain records. Every other extension fails with at least one of these settings and creates URLs with the host name prepended twice (an invalid URL in the sitemap!)
  • none of the existing extensions currently creates a news sitemap correctly
  • all existing extensions are implemented as normal plugins and thus do not perform well enough
My extension is better because it:
  • correctly creates URLs
  • adds a date of last page modification
  • calculates page change frequency based on history of page updates
  • creates news URLs correctly
  • uses the eID feature, which puts much less load on the server
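As an illustration of how a change frequency could be derived from the history of page updates, here is a minimal Python sketch. It is not the extension's actual algorithm: the thresholds and the function name are assumptions. Only the returned strings are real values of the `changefreq` element from the sitemap protocol.

```python
from datetime import datetime, timedelta


def change_frequency(update_times):
    """Map the average gap between page updates to a sitemap changefreq value."""
    if len(update_times) < 2:
        return None  # not enough history: leave changefreq out
    ordered = sorted(update_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = sum(gaps, timedelta()) / len(gaps)
    # Hypothetical thresholds, chosen only for this sketch.
    if average <= timedelta(hours=1):
        return "hourly"
    if average <= timedelta(days=1):
        return "daily"
    if average <= timedelta(weeks=1):
        return "weekly"
    if average <= timedelta(days=31):
        return "monthly"
    return "yearly"
```

A page with no usable history gets `None` here, which a caller could treat as "omit the element".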

Feb 23, 2009

How HTML markup affects Google search results

No one knows exactly how Google ranking features work. Google keeps it hidden for obvious reasons: Google wants to maintain better search relevancy and prevent optimizers from moving sites to a higher position in search results.

Several months ago I redesigned several of our sites. Yesterday I checked and found that these sites are in the top position in Google for some popular search terms. I really did not expect that. The content there is good and has not changed much. However, after the redesign the sites moved up.

I think the reason is my changes to the markup. Earlier it had a lot of <div> and <table> tags. Now the markup reflects the logical organization of the information. This worked well and I want to share some principles of such design. Here they are:

How to replace a flexform field name without losing data

Many years ago when I just started to work with TemplaVoila, I did my objects the same way as many people do: separate DS/TO pairs even if objects are similar. For example, on one site I had three types of recipes and I had three separate DS/TO pairs for them. I should have made a single data source and three different template objects instead.
It would not be difficult to change all template objects to use the same data source (just replace the data source id for all flexible content elements). The real problem was that fields in the data sources were named differently. The data was similar but the field names were different.
How to solve the problem in the most efficient way? Please, stop reading now and think.

Feb 18, 2009

Another bad HTTP request

Here it is:

HEAD /aprunasimies/forums/forum.php?id=2%23page-1 HTTP/1.1
User-Agent: User-Agent
If-Modified-Since: Mozilla/4.0 (compatible; Windows XP 5.1)

%23 is an encoded # character (the fragment must not be in the URL sent to the server). The User-Agent value is fun and the If-Modified-Since value is even more fun...
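A short Python sketch shows what the server receives here compared with what a well-behaved client would send. The request target is copied from the request above. The `example.com` host and the `request_target` helper are assumptions for illustration.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# Request target exactly as it appeared in the bad request.
raw_target = "/aprunasimies/forums/forum.php?id=2%23page-1"

# Decoding shows that %23 is an encoded '#'.
decoded = unquote(raw_target)

# Because the '#' arrived encoded, the server sees it as part of the id value.
server_side_id = parse_qs(urlsplit(raw_target).query)["id"][0]


def request_target(url):
    """Return the path and query a client should send: the fragment is dropped."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


# Hypothetical host; only the path and query come from the log above.
good = request_target("http://example.com/aprunasimies/forums/forum.php?id=2#page-1")
```

Here `server_side_id` is the string `2#page-1`, while `good` ends in `?id=2`, because the fragment stays in the browser.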

No wonder that it is blocked.

The Internet is full of junk :(

Feb 17, 2009

Useless SQL found in TYPO3

Like any automated system TYPO3 sometimes produces redundant code. Most often it happens in SQL expressions. For example, it is common to see:

SELECT uid FROM tx_myext_table WHERE 1=1 AND deleted=0 AND hidden=0

The "AND deleted=0 AND hidden=0" part is produced by TYPO3, so the extension's developer has to put "1=1" in front of it to make it valid SQL. This is understandable.

What I do not really understand is when people do it like this:

$GLOBALS['TYPO3_DB']->exec_SELECTquery('uid', 'tx_myext_table',
'1=1 AND pid=' . intval($conf['pid']) .

What is the point of the "1=1 AND" part of the WHERE clause in this code example? Absolutely none.

So why is it there?
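The point can be shown with a minimal Python sketch (not TYPO3 code). It assumes the generated part always starts with " AND", as in the SQL example above. The function name and the default suffix are made up for illustration.

```python
def build_where(conditions, enable_fields=" AND deleted=0 AND hidden=0"):
    """Join the extension's own conditions and append the generated suffix."""
    # The suffix starts with ' AND', so something must precede it.
    # '1=1' is only needed as a placeholder when there are no real conditions.
    base = " AND ".join(conditions) if conditions else "1=1"
    return base + enable_fields


print(build_where([]))
print(build_where(["pid=42"]))
```

With no conditions the placeholder is required. As soon as there is a real condition such as `pid=42`, the clause is valid without it.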

Attack on DNS servers

My DNS server has been under attack for two weeks already. Queries like these come from a single IP address every day, several times per second:

Feb 17 19:44:32 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:35 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:35 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:38 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:38 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:42 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:42 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:44 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:44 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:48 angel named[14025]: client query (cache) './NS/IN' denied
Feb 17 19:44:48 angel named[14025]: client query (cache) './NS/IN' denied

I ban it and the next day it comes from another, unrelated address. Stupid and counterproductive.

Feb 15, 2009

Toata dragostea mea pentru diavola

After the recent security issues with TYPO3 I keep an especially close watch on my servers' mod_security logs. jumpurl attacks come from many IP addresses and they already bore me. However, today I saw something new and interesting:

Request: - - [15/Feb/2009:15:26:10 +0200] "GET /bug/login_page.php HTTP/1.1" 403 220 "-" "Toata dragostea mea pentru diavola" A2x6cn8AAAIAAEygTEwAAAAs "-"
Handler: redirect-handler
GET /bug/login_page.php HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Toata dragostea mea pentru diavola
Connection: Close
mod_security-message: Access denied with code 403. <rule is hidden> [severity "EMERGENCY"]
mod_security-action: 403

HTTP/1.1 403 Forbidden
Content-Length: 220
Connection: close
Content-Type: text/html; charset=iso-8859-1

Feb 13, 2009

How I tightened PHP security and broke three extensions

Given the recent security threats with TYPO3 I decided to tighten security a little more on my servers. I already run Apache and other services in a chrooted environment with very restricted permissions. Now I decided to enable the open_basedir setting. I did not use it before because I know that many extensions are not written with open_basedir in mind. But I decided to take the risk.

It worked well at first. I set open_basedir to "/path/to/site/root/:/path/to/typo3_src/". This is how this and my other sites run now. All was ok.
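To illustrate what such a setting allows, here is a simplified Python sketch of the idea behind open_basedir: a file may be opened only if it lies inside one of the listed directories. This is an approximation, not PHP's implementation (PHP also resolves symbolic links, for example). The function name and the sample paths in the usage note are hypothetical.

```python
import posixpath


def is_allowed(path, open_basedir):
    """Approximate an open_basedir check: is path inside one of the listed roots?"""
    target = posixpath.normpath(path)
    for root in open_basedir.split(":"):
        root = posixpath.normpath(root)
        # Match the root itself or anything below it, but not sibling
        # directories that merely share the same name prefix.
        if target == root or target.startswith(root + "/"):
            return True
    return False


SETTING = "/path/to/site/root/:/path/to/typo3_src/"
```

With this setting, `is_allowed("/path/to/site/root/typo3conf/localconf.php", SETTING)` is true, while a path such as `/tmp/upload.tmp` is rejected.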

Today one of the visitors noticed that one particular page was blank. For any more or less experienced TYPO3 developer it means that a PHP error happened and was not shown to the user. I have set devIPmask to my IP address and displayErrors to 2. This way I always see all PHP errors while visitors do not see them.

I went to the page and was very surprised to see an error in an extension of mine, which had not been updated for half a year or so. It definitely broke due to open_basedir. I turned open_basedir off and the page went back to normal.

But what could break? I always program extensions so that they do not go anywhere outside of web root.

Feb 12, 2009

Hackers are hunting!

Despite many warnings, some providers and companies are very slow to update their TYPO3 web sites. There are reports of successfully hacked sites.

According to my logs, hackers are hunting for web sites that still run vulnerable versions. If they find such sites, the sites will be hacked. If you care about your business, upgrade your web site!