Overview on what a binlog is


The binary log contains “events” that describe database changes such as table creation operations or changes to table data.
It also contains events for statements that potentially could have made changes (for example, a DELETE which matched no rows),
unless row-based logging is used.
The binary log also contains information about how long each statement took that updated data.

The binary log has two important purposes:
  • For replication, the binary log is used on master replication servers as a record of the statements to be sent to slave servers.
    The master server sends the events contained in its binary log to its slaves, which execute those events to make the same data
    changes that were made on the master. See Section 16.2, “Replication Implementation”.
  • Certain data recovery operations require use of the binary log. After a backup has been restored, the events in the binary log
    that were recorded after the backup was made are re-executed. These events bring databases up to date from the
    point of the backup. See Section 6.5, “Point-in-Time (Incremental) Recovery Using the Binary Log”.


Running a server with binary logging enabled makes performance slightly slower.
However, the benefits of the binary log in enabling you to set up replication and for
restore operations generally outweigh this minor performance decrement.

 

Binlog main settings

The binlog is controlled by few settings in the my.cnf file, let quickly see them (credits MySQL manual):

log-bin[=base_name]
Enable binary logging. The option value, if given, is the basename for the log sequence.
The server creates binary log files in sequence by adding a numeric suffix to the basename.
It is recommended that you specify a basename, otherwise, MySQL uses host_name-bin as the basename.

Parameter to do or ignore databases

binlog-do-db=db_name
binlog-ignore-db=db_name
replicate-do-db
replicate-ignore-db

sync_binlog
-------------
If the value of this variable is greater than 0, the MySQL server synchronizes its binary log to disk (using fdatasync()) after every sync_binlog writes to the binary log.
There is one write to the binary log per statement if autocommit is enabled, and one write per transaction otherwise.
The default value of sync_binlog is 0, which does no synchronizing to disk—in this case, the server relies
on the operating system to flush the binary log's contents from to time as for any other file.
A value of 1 is the safest choice because in the event of a crash you lose at most one statement or transaction from the binary log.
However, it is also the slowest choice (unless the disk has a battery-backed cache, which makes synchronization very fast).

binlog_cache_size
-----------------
The size of the cache to hold the SQL statements for the binary log during a transaction.
A binary log cache is allocated for each client if the server supports any transactional storage engines and if the server has the binary log enabled (--log-bin option).
If you often use large, multiple-statement transactions, you can increase this cache size to get more performance.


max_binlog_cache_size
---------------------
If a multiple-statement transaction requires more than this many bytes of memory, the server generates a Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage error.
The minimum value is 4096.


max_binlog_size
---------------
If a write to the binary log causes the current log file size to exceed the value of this variable, the server rotates the binary logs(closes the current file and opens the next one).
The minimum value is 4096 bytes. The maximum and default value is 1GB.

expire_logs_days
----------------
The number of days for automatic binary log file removal. The default is 0, which means “no automatic removal.”
Possible removals happen at startup and when the binary log is flushed.


An example of usage of all of them:

##############
#REPLICATION
#############
log_bin = /var/mysql/log/binlog51/mysql-bin
relay-log = /var/mysql/log/relaylog/relay

expire_logs_days = 30
max_binlog_size = 500M
log-slave-updates = 1
max_binlog_cache_size = 4294967295
sync_binlog = 1

#skip-slave-start
replicate-ignore-db=mysql


In this case what we have define is an automatic purge of the Binary logs after 30 days.
30 days is a safe window for the replication to happen, so there should not be any issue here in removing the binary logs.
Not always we are so lucky, and some time we have to clean up our disks to save space.

In this case is a good idea to set-up a procedure to perform checks against the slaves before performing a purge, you can do it by yourself
or trust one of the many script on the web, good starting point is Maatkit.

Now let us assume that for any reason you have purged a binary log from the master, and that the slave was not already there.

What will happen is that you will receive a very nice message from MySQL replication:
"Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted
(you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log),
a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to
know their names by issuing 'SHOW SLAVE STATUS' on this slave."

Nothing unexpected ... you have deleted it with the PURGE so log not there anymore.
Assuming you have the backup of the purged binlog (as it should ALWAYS be), copy back the file(s), and edit the <binlog>.index file, adding the missed entry previously PURGED.

Then check if it sounds fine:

mysql> show binary logs;


It will show the added entry.


Well Nice ALL done !!!

Assuming from that we could restart the slave... So let us try.
What we will get is something frustrating:

...
100929 11:21:47 [Note] Slave SQL thread initialized, starting replication
in log 'mysql-bin.002671' at position 1073741674, relay log './xxxxxxxxxxxx-relay-bin.000001' position: 4
100929 11:21:47 [Note] Slave I/O thread: connected to master This email address is being protected from spambots. You need JavaScript enabled to view it.:3306',
replication started in log 'mysql-bin.002671' at position 1073741674
100929 11:21:47 [ERROR] Error reading packet from server: Could not find first log
file name in binary log index file ( server_errno=1236)
100929 11:21:47 [ERROR] Got fatal error 1236: 'Could not find first log file name
in binary log index file' from master when reading data from binary log
100929 11:21:47 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.002671', position 1073741674
100929 11:22:08 [Note] Error reading relay log event: slave SQL thread was killed

 

 

Why this happens?

The situation seems a little bit confusing, right?
The MySQL server say that it CAN see them, but files are not valid for the replication process, so what is going on?

Well first of HOW I DO FIX it this will help us to understand.
Here a comment, I got comment form some MySQL DBA saying that the way is to RESTART MySQL, true if you do that it will fix it,
but you DO NOT NEED TO restart MySQL in this case, instead do the right thing:

mysql> FLUSH LOGS;

 

Then restart the slave process on the SLAVE:

mysql> SLAVE START;

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: xxxxx.xxxxxx.xxx
Master_User: xxxx
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.xxxx
Read_Master_Log_Pos: xxxxx
Relay_Log_File: xxxxxxxxxxxx-relay-bin.xxxx
Relay_Log_Pos: xxxxxx
Relay_Master_Log_File: mysql-bin.xxx
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 131915378
Relay_Log_Space: 563985741
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 946273
1 row in set (0.00 sec)

 

WOW it works ..., well this is expected, but I want to spend one second on WHY it works.

What we have to do in this cases is to read the source code and once more see what is happening there.

Reading the code we will see that FLUSH LOGS, will also call this function, which will instanciate a new MYSQL_LOG object.

The point is all here.

 if (log_type == LOG_BIN) 
{
if (!no_auto_events)
{
/*
We log the whole file name for log file as the user may decide
to change base names at some point.
*/
THD *thd = current_thd; /* may be 0 if we are reacting to SIGHUP */
Rotate_log_event r(thd,new_name+dirname_length(new_name),
0, LOG_EVENT_OFFSET, 0);
r.write(&log_file);
bytes_written += r.data_written;
}
/*
Update needs to be signalled even if there is no rotate event
log rotation should give the waiting thread a signal to
discover EOF and move on to the next log.
*/
signal_update();
}
old_name=name;
save_log_type=log_type;
name=0; // Don't free name
close(LOG_CLOSE_TO_BE_OPENED | LOG_CLOSE_INDEX);

The function will close the current BINLOG and also the the <binlog>.index associated, then will ROTATE
the binarylog, and create new one with a new name, finally reopen a new <binlog>.index.

/* reopen index binlog file, BUG#34582 */
if (!open_index_file(index_file_name, 0))
open(old_name, save_log_type, new_name_ptr,
io_cache_type, no_auto_events, max_size, 1);
my_free(old_name,MYF(0));

 

Lesson Learn

===============
Also if MySQL allow us to do nice changes on the fly, editing and modifying the text files like binlog.index,
internally these changes are not considered valid unless the descriptors for these objects are refresh.
Refresh that is done in different way one of them is using FLUSH LOGS.

Final Consideration

===================
So far so good and nothing really new except that we have done another dig in MySQL code, to understand what and how it works.
There will be much more to say about, and I admit that I am doing it simpler then it is, but it was not my intention to write an article
on code digging, but only showing how also simple and well know actions like the one here reported, could be misleading, if we do not
have a better understanding.
This also put in evidence that it would have being much better:

  1. refresh the binlog.index automatically when it is modify
  2. do not accept/show the binlog.index modified IF there is not an internal refresh, this is confusing, misleading and not consistent.


Hope this (simple) issue will be fix by Oracle in on of the many code review to go.

 

 

void MYSQL_LOG::new_file(bool need_lock)
{
char new_name[FN_REFLEN], *new_name_ptr, *old_name;
enum_log_type save_log_type;

DBUG_ENTER("MYSQL_LOG::new_file");
if (!is_open())
{
DBUG_PRINT("info",("log is closed"));
DBUG_VOID_RETURN;
}

if (need_lock)
pthread_mutex_lock(&LOCK_log);
pthread_mutex_lock(&LOCK_index);

safe_mutex_assert_owner(&LOCK_log);
safe_mutex_assert_owner(&LOCK_index);

/*
if binlog is used as tc log, be sure all xids are "unlogged",
so that on recover we only need to scan one - latest - binlog file
for prepared xids. As this is expected to be a rare event,
simple wait strategy is enough. We're locking LOCK_log to be sure no
new Xid_log_event's are added to the log (and prepared_xids is not
increased), and waiting on COND_prep_xids for late threads to
catch up.
*/
if (prepared_xids)
{
tc_log_page_waits++;
pthread_mutex_lock(&LOCK_prep_xids);
while (prepared_xids)
pthread_cond_wait(&COND_prep_xids, &LOCK_prep_xids);
pthread_mutex_unlock(&LOCK_prep_xids);
}

/* Reuse old name if not binlog and not update log */
new_name_ptr= name;

/*
If user hasn't specified an extension, generate a new log name
We have to do this here and not in open as we want to store the
new file name in the current binary log file.
*/
if (generate_new_name(new_name, name))
goto end;
new_name_ptr=new_name;

if (log_type == LOG_BIN)
{
if (!no_auto_events)
{
/*
We log the whole file name for log file as the user may decide
to change base names at some point.
*/
THD *thd = current_thd; /* may be 0 if we are reacting to SIGHUP */
Rotate_log_event r(thd,new_name+dirname_length(new_name),
0, LOG_EVENT_OFFSET, 0);
r.write(&log_file);
bytes_written += r.data_written;
}
/*
Update needs to be signalled even if there is no rotate event
log rotation should give the waiting thread a signal to
discover EOF and move on to the next log.
*/
signal_update();
}
old_name=name;
save_log_type=log_type;
name=0; // Don't free name
close(LOG_CLOSE_TO_BE_OPENED | LOG_CLOSE_INDEX);

/*
Note that at this point, log_type != LOG_CLOSED (important for is_open()).
*/

/*
new_file() is only used for rotation (in FLUSH LOGS or because size >
max_binlog_size or max_relay_log_size).
If this is a binary log, the Format_description_log_event at the beginning of
the new file should have created=0 (to distinguish with the
Format_description_log_event written at server startup, which should
trigger temp tables deletion on slaves.
*/

/* reopen index binlog file, BUG#34582 */
if (!open_index_file(index_file_name, 0))
open(old_name, save_log_type, new_name_ptr,
io_cache_type, no_auto_events, max_size, 1);
my_free(old_name,MYF(0));

end:
if (need_lock)
pthread_mutex_unlock(&LOCK_log);
pthread_mutex_unlock(&LOCK_index);

DBUG_VOID_RETURN;
}

 

{joscommentenable}