/se3-unattended/var/se3/unattended/install/linuxaux/opt/perl/lib/5.10.0/pod/ -> perlothrtut.pod (source)

   1  =head1 NAME
   2  
   3  perlothrtut - old tutorial on threads in Perl
   4  
   5  =head1 DESCRIPTION
   6  
   7  B<WARNING>:
   8  This tutorial describes the old-style thread model that was introduced in
   9  release 5.005. This model is deprecated, and has been removed
  10  for version 5.10. The interfaces described here were considered
  11  experimental, and are likely to be buggy.
  12  
  13  For information about the new interpreter threads ("ithreads") model, see
  14  the L<perlthrtut> tutorial, and the L<threads> and L<threads::shared>
  15  modules.
  16  
  17  You are strongly encouraged to migrate any existing threads code to the
  18  new model as soon as possible.
  19  
  20  =head1 What Is A Thread Anyway?
  21  
  22  A thread is a flow of control through a program with a single
  23  execution point.
  24  
  25  Sounds an awful lot like a process, doesn't it? Well, it should.
  26  Threads are one of the pieces of a process.  Every process has at least
  27  one thread and, up until now, every process running Perl had only one
  28  thread.  With 5.005, though, you can create extra threads.  We're going
  29  to show you how, when, and why.
  30  
  31  =head1 Threaded Program Models
  32  
  33  There are three basic ways that you can structure a threaded
  34  program.  Which model you choose depends on what you need your program
  35  to do.  For many non-trivial threaded programs you'll need to choose
  36  different models for different pieces of your program.
  37  
  38  =head2 Boss/Worker
  39  
  40  The boss/worker model usually has one `boss' thread and one or more
  41  `worker' threads.  The boss thread gathers or generates tasks that need
  42  to be done, then parcels those tasks out to the appropriate worker
  43  thread.
  44  
  45  This model is common in GUI and server programs, where a main thread
  46  waits for some event and then passes that event to the appropriate
  47  worker threads for processing.  Once the event has been passed on, the
  48  boss thread goes back to waiting for another event.
  49  
  50  The boss thread does relatively little work.  While tasks aren't
  51  necessarily performed faster than with any other method, it tends to
  52  have the best user-response times.
  53  
  54  =head2 Work Crew
  55  
  56  In the work crew model, several threads are created that do
  57  essentially the same thing to different pieces of data.  It closely
  58  mirrors classical parallel processing and vector processors, where a
  59  large array of processors do the exact same thing to many pieces of
  60  data.
  61  
  62  This model is particularly useful if the system running the program
  63  will distribute multiple threads across different processors.  It can
  64  also be useful in ray tracing or rendering engines, where the
  65  individual threads can pass on interim results to give the user visual
  66  feedback.
  67  
  68  =head2 Pipeline
  69  
  70  The pipeline model divides up a task into a series of steps, and
  71  passes the results of one step on to the thread processing the
  72  next.  Each thread does one thing to each piece of data and passes the
  73  results to the next thread in line.
  74  
  75  This model makes the most sense if you have multiple processors so two
  76  or more threads will be executing in parallel, though it can often
  77  make sense in other contexts as well.  It tends to keep the individual
  78  tasks small and simple, as well as allowing some parts of the pipeline
  79  to block (on I/O or system calls, for example) while other parts keep
  80  going.  If you're running different parts of the pipeline on different
  81  processors you may also take advantage of the caches on each
  82  processor.
  83  
  84  This model is also handy for a form of recursive programming where,
  85  rather than having a subroutine call itself, it instead creates
  86  another thread.  Prime and Fibonacci generators both map well to this
  87  form of the pipeline model. (A version of a prime number generator is
  88  presented later on.)
  89  
  90  =head1 Native threads
  91  
  92  There are several different ways to implement threads on a system.  How
  93  threads are implemented depends both on the vendor and, in some cases,
  94  the version of the operating system.  Often the first implementation
  95  will be relatively simple, but later versions of the OS will be more
  96  sophisticated.
  97  
  98  While the information in this section is useful, it's not necessary,
  99  so you can skip it if you don't feel up to it.
 100  
 101  There are three basic categories of threads--user-mode threads, kernel
 102  threads, and multiprocessor kernel threads.
 103  
 104  User-mode threads are threads that live entirely within a program and
 105  its libraries.  In this model, the OS knows nothing about threads.  As
 106  far as it's concerned, your process is just a process.
 107  
 108  This is the easiest way to implement threads, and the way most OSes
 109  start.  The big disadvantage is that, since the OS knows nothing about
 110  threads, if one thread blocks they all do.  Typical blocking activities
 111  include most system calls, most I/O, and things like sleep().
 112  
 113  Kernel threads are the next step in thread evolution.  The OS knows
 114  about kernel threads, and makes allowances for them.  The main
 115  difference between a kernel thread and a user-mode thread is
 116  blocking.  With kernel threads, things that block a single thread don't
 117  block other threads.  This is not the case with user-mode threads,
 118  where the kernel blocks at the process level and not the thread level.
 119  
 120  This is a big step forward, and can give a threaded program quite a
 121  performance boost over non-threaded programs.  Threads that block
 122  performing I/O, for example, won't block threads that are doing other
 123  things.  Each process still has only one thread running at once,
 124  though, regardless of how many CPUs a system might have.
 125  
 126  Since kernel threading can interrupt a thread at any time, it will
 127  uncover some of the implicit locking assumptions you may make in your
 128  program.  For example, something as simple as C<$a = $a + 2> can behave
 129  unpredictably with kernel threads if $a is visible to other
 130  threads, as another thread may have changed $a between the time it
 131  was fetched on the right hand side and the time the new value is
 132  stored.
 133  
 134  Multiprocessor Kernel Threads are the final step in thread
 135  support.  With multiprocessor kernel threads on a machine with multiple
 136  CPUs, the OS may schedule two or more threads to run simultaneously on
 137  different CPUs.
 138  
 139  This can give a serious performance boost to your threaded program,
 140  since more than one thread will be executing at the same time.  As a
 141  tradeoff, though, any of those nagging synchronization issues that
 142  might not have shown with basic kernel threads will appear with a
 143  vengeance.
 144  
 145  In addition to the different levels of OS involvement in threads,
 146  different OSes (and different thread implementations for a particular
 147  OS) allocate CPU cycles to threads in different ways.
 148  
 149  Cooperative multitasking systems have running threads give up control
 150  if one of two things happens.  If a thread calls a yield function, it
 151  gives up control.  It also gives up control if the thread does
 152  something that would cause it to block, such as perform I/O.  In a
 153  cooperative multitasking implementation, one thread can starve all the
 154  others for CPU time if it so chooses.
 155  
 156  Preemptive multitasking systems interrupt threads at regular intervals
 157  while the system decides which thread should run next.  In a preemptive
 158  multitasking system, one thread usually won't monopolize the CPU.
 159  
 160  On some systems, there can be cooperative and preemptive threads
 161  running simultaneously. (Threads running with realtime priorities
 162  often behave cooperatively, for example, while threads running at
 163  normal priorities behave preemptively.)
 164  
 165  =head1 What kind of threads are perl threads?
 166  
 167  If you have experience with other thread implementations, you might
 168  find that things aren't quite what you expect.  It's very important to
 169  remember when dealing with Perl threads that Perl Threads Are Not X
 170  Threads, for all values of X.  They aren't POSIX threads, or
 171  DecThreads, or Java's Green threads, or Win32 threads.  There are
 172  similarities, and the broad concepts are the same, but if you start
 173  looking for implementation details you're going to be either
 174  disappointed or confused.  Possibly both.
 175  
 176  This is not to say that Perl threads are completely different from
 177  everything that's ever come before--they're not.  Perl's threading
 178  model owes a lot to other thread models, especially POSIX.  Just as
 179  Perl is not C, though, Perl threads are not POSIX threads.  So if you
 180  find yourself looking for mutexes, or thread priorities, it's time to
 181  step back a bit and think about what you want to do and how Perl can
 182  do it.
 183  
 184  =head1 Threadsafe Modules
 185  
 186  The addition of threads has changed Perl's internals
 187  substantially.  There are implications for people who write
 188  modules--especially modules with XS code or external libraries.  While
 189  most modules won't encounter any problems, modules that aren't
 190  explicitly tagged as thread-safe should be tested before being used in
 191  production code.
 192  
 193  Not all modules that you might use are thread-safe, and you should
 194  always assume a module is unsafe unless the documentation says
 195  otherwise.  This includes modules that are distributed as part of the
 196  core.  Threads are a beta feature, and even some of the standard
 197  modules aren't thread-safe.
 198  
 199  If you're using a module that's not thread-safe for some reason, you
 200  can protect yourself by using semaphores and lots of programming
 201  discipline to control access to the module.  Semaphores are covered
 202  later in the article.
 203  
 204  =head1 Thread Basics
 205  
 206  The core Thread module provides the basic functions you need to write
 207  threaded programs.  In the following sections we'll cover the basics,
 208  showing you what you need to do to create a threaded program.   After
 209  that, we'll go over some of the features of the Thread module that
 210  make threaded programming easier.
 211  
 212  =head2 Basic Thread Support
 213  
 214  Thread support is a Perl compile-time option--it's something that's
 215  turned on or off when Perl is built at your site, rather than when
 216  your programs are compiled. If your Perl wasn't compiled with thread
 217  support enabled, then any attempt to use threads will fail.
 218  
 219  Remember that the threading support in 5.005 is in beta release, and
 220  should be treated as such.   You should expect that it may not function
 221  entirely properly, and the thread interface may well change some
 222  before it is a fully supported, production release.  The beta version
 223  shouldn't be used for mission-critical projects.  Having said that,
 224  threaded Perl is pretty nifty, and worth a look.
 225  
 226  Your programs can use the Config module to check whether threads are
 227  enabled. If your program can't run without them, you can say something
 228  like:
 229  
 230    use Config; $Config{usethreads} or die "Recompile Perl with threads to run this program.";
 231  
 232  A possibly-threaded program using a possibly-threaded module might
 233  have code like this:
 234  
 235      use Config; 
 236      use MyMod; 
 237  
 238      if ($Config{usethreads}) { 
 239          # We have threads 
 240          require MyMod_threaded; 
 241          import MyMod_threaded; 
 242      } else { 
 243          require MyMod_unthreaded; 
 244          import MyMod_unthreaded; 
 245      } 
 246  
 247  Since code that runs both with and without threads is usually pretty
 248  messy, it's best to isolate the thread-specific code in its own
 249  module.  In our example above, that's what MyMod_threaded is, and it's
 250  only imported if we're running on a threaded Perl.
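
As a rough sketch of the check described above, the fragment below
reports which threading model the running perl was built with.  The
helper name thread_flavor() is made up for this illustration;
C<use5005threads> and C<useithreads> are the Config keys for the old
model covered in this tutorial and for the newer interpreter threads.

```perl
use strict;
use warnings;
use Config;

# Report which threading model, if any, this perl was built with.
# thread_flavor() is a hypothetical helper, not part of any module.
sub thread_flavor {
    return '5005threads' if $Config{use5005threads};   # old model (this tutorial)
    return 'ithreads'    if $Config{useithreads};      # newer interpreter threads
    return 'none';                                      # no thread support compiled in
}

print "Thread support: ", thread_flavor(), "\n";
```

A program could branch on the returned string to decide which
thread-specific module to load, much as MyMod_threaded is loaded above.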
 251  
 252  =head2 Creating Threads
 253  
 254  The Thread package provides the tools you need to create new
 255  threads.  Like any other module, you need to tell Perl you want to use
 256  it; use Thread imports all the pieces you need to create basic
 257  threads.
 258  
 259  The simplest, most straightforward way to create a thread is with new():
 260  
 261      use Thread; 
 262  
 263      $thr = Thread->new( \&sub1 );
 264  
 265      sub sub1 { 
 266          print "In the thread\n"; 
 267      }
 268  
 269  The new() method takes a reference to a subroutine and creates a new
 270  thread, which starts executing in the referenced subroutine.  Control
 271  then passes both to the subroutine and the caller.
 272  
 273  If you need to, your program can pass parameters to the subroutine as
 274  part of the thread startup.  Just include the list of parameters as
 275  part of the C<Thread::new> call, like this:
 276  
 277      use Thread; 
 278      $Param3 = "foo"; 
 279      $thr = Thread->new( \&sub1, "Param 1", "Param 2", $Param3 );
 280      $thr = Thread->new( \&sub1, @ParamList );
 281      $thr = Thread->new( \&sub1, qw(Param1 Param2 Param3) );
 282  
 283      sub sub1 { 
 284          my @InboundParameters = @_; 
 285          print "In the thread\n"; 
 286          print "got parameters >", join("<>", @InboundParameters), "<\n"; 
 287      }
 288  
 289  
 290  The subroutine runs like a normal Perl subroutine.  The call to Thread->new
 291  returns a thread object; the subroutine's own return values come from join().
 292  
 293  The last example illustrates another feature of threads.  You can spawn
 294  off several threads using the same subroutine.  Each thread executes
 295  the same subroutine, but in a separate thread with a separate
 296  environment and potentially separate arguments.
 297  
 298  The other way to spawn a new thread is with async(), which is a way to
 299  spin off a chunk of code like eval(), but into its own thread:
 300  
 301      use Thread qw(async);
 302  
 303      $LineCount = 0; 
 304  
 305      $thr = async { 
 306          while(<>) {$LineCount++}      
 307          print "Got $LineCount lines\n";
 308      }; 
 309  
 310      print "Waiting for the linecount to end\n"; 
 311      $thr->join; 
 312      print "All done\n";
 313  
 314  You'll notice we did a use Thread qw(async) in that example.  async is
 315  not exported by default, so if you want it, you'll either need to
 316  import it before you use it or fully qualify it as
 317  Thread::async.  You'll also note that there's a semicolon after the
 318  closing brace.  That's because async() treats the following block as an
 319  anonymous subroutine, so the semicolon is necessary.
 320  
 321  Like eval(), the code executes in the same context as it would if it
 322  weren't spun off.  Since both the code inside and after the async start
 323  executing, you need to be careful with any shared resources.  Locking
 324  and other synchronization techniques are covered later.
 325  
 326  =head2 Giving up control
 327  
 328  There are times when you may find it useful to have a thread
 329  explicitly give up the CPU to another thread.  Your threading package
 330  might not support preemptive multitasking for threads, for example, or
 331  you may be doing something compute-intensive and want to make sure
 332  that the user-interface thread gets called frequently.  Regardless,
 333  there are times that you might want a thread to give up the processor.
 334  
 335  Perl's threading package provides the yield() function that does
 336  this. yield() is pretty straightforward, and works like this:
 337  
 338      use Thread qw(yield async); 
 339      async { 
 340          my $foo = 50; 
 341          while ($foo--) { print "first async\n" }
 342          yield; 
 343          $foo = 50; 
 344          while ($foo--) { print "first async\n" } 
 345      }; 
 346      async { 
 347          my $foo = 50; 
 348          while ($foo--) { print "second async\n" }
 349          yield; 
 350          $foo = 50; 
 351          while ($foo--) { print "second async\n" } 
 352      };
 353  
 354  =head2 Waiting For A Thread To Exit
 355  
 356  Since threads are also subroutines, they can return values.  To wait
 357  for a thread to exit and extract any scalars it might return, you can
 358  use the join() method.
 359  
 360      use Thread; 
 361      $thr = Thread->new( \&sub1 );
 362  
 363      @ReturnData = $thr->join; 
 364      print "Thread returned @ReturnData\n";
 365  
 366      sub sub1 { return "Fifty-six", "foo", 2; }
 367  
 368  In the example above, the join() method returns as soon as the thread
 369  ends.  In addition to waiting for a thread to finish and gathering up
 370  any values that the thread might have returned, join() also performs
 371  any OS cleanup necessary for the thread.  That cleanup might be
 372  important, especially for long-running programs that spawn lots of
 373  threads.  If you don't want the return values and don't want to wait
 374  for the thread to finish, you should call the detach() method
 375  instead. detach() is covered later in the article.
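
Since the old Thread module was removed in 5.10, the following sketch
shows the same create-and-join pattern with the newer threads module
that the warning at the top recommends.  It assumes a perl built with
ithreads; run_and_join() is a name invented for the example.  One
difference worth knowing: under ithreads the thread's calling context
is fixed when the thread is created, so the thread object is assigned
in list context here to get a list back from join().

```perl
use strict;
use warnings;
use threads;

# Worker: returns a list, like sub1 in the example above.
sub sub1 { return ("Fifty-six", "foo", 2) }

# run_and_join() is a hypothetical helper for this sketch.
sub run_and_join {
    # The parentheses around $thr put create() in list context, which
    # makes ithreads call sub1 in list context as well.
    my ($thr) = threads->create(\&sub1);
    return $thr->join;    # waits for the thread, then returns its values
}

my @ReturnData = run_and_join();
print "Thread returned @ReturnData\n";   # prints "Thread returned Fifty-six foo 2"
```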
 376  
 377  =head2 Errors In Threads
 378  
 379  So what happens when an error occurs in a thread? Any errors that
 380  could be caught with eval() are postponed until the thread is
 381  joined.  If your program never joins, the errors appear when your
 382  program exits.
 383  
 384  Errors deferred until a join() can be caught with eval():
 385  
 386      use Thread qw(async); 
 387      $thr = async {$b = 3/0};   # Divide by zero error
 388      $foo = eval {$thr->join}; 
 389      if ($@) { 
 390          print "died with error $@\n"; 
 391      } else { 
 392          print "Hey, why aren't you dead?\n"; 
 393      }
 394  
 395  eval() passes any results from the joined thread back unmodified, so
 396  if you want the return values of the thread, this is your only chance
 397  to get them.
 398  
 399  =head2 Ignoring A Thread
 400  
 401  join() does three things: it waits for a thread to exit, cleans up
 402  after it, and returns any data the thread may have produced.  But what
 403  if you're not interested in the thread's return values, and you don't
 404  really care when the thread finishes? All you want is for the thread
 405  to get cleaned up after when it's done.
 406  
 407  In this case, you use the detach() method.  Once a thread is detached,
 408  it'll run until it's finished, then Perl will clean up after it
 409  automatically.
 410  
 411      use Thread; 
 412      $thr = Thread->new( \&sub1 ); # Spawn the thread
 413  
 414      $thr->detach; # Now we officially don't care any more
 415  
 416      sub sub1 { 
 417          $a = 0; 
 418          while (1) { 
 419              $a++; 
 420              print "\$a is $a\n"; 
 421              sleep 1; 
 422          } 
 423      }
 424  
 425  
 426  Once a thread is detached, it may not be joined, and any output that
 427  it might have produced (if it was done and waiting for a join) is
 428  lost.
 429  
 430  =head1 Threads And Data
 431  
 432  Now that we've covered the basics of threads, it's time for our next
 433  topic: data.  Threading introduces a couple of complications to data
 434  access that non-threaded programs never need to worry about.
 435  
 436  =head2 Shared And Unshared Data
 437  
 438  The single most important thing to remember when using threads is that
 439  all threads potentially have access to all the data anywhere in your
 440  program.  While this is true with a nonthreaded Perl program as well,
 441  it's especially important to remember with a threaded program, since
 442  more than one thread can be accessing this data at once.
 443  
 444  Perl's scoping rules don't change because you're using threads.  If a
 445  subroutine (or block, in the case of async()) could see a variable if
 446  you weren't running with threads, it can see it if you are.  This is
 447  especially important for the subroutines that create threads, and makes C<my>
 448  variables even more important.  Remember--if your variables aren't
 449  lexically scoped (declared with C<my>) you're probably sharing them
 450  between threads.
 451  
 452  =head2 Thread Pitfall: Races
 453  
 454  While threads bring a new set of useful tools, they also bring a
 455  number of pitfalls.  One pitfall is the race condition:
 456  
 457      use Thread; 
 458      $a = 1; 
 459      $thr1 = Thread->new(\&sub1); 
 460      $thr2 = Thread->new(\&sub2); 
 461  
 462      sleep 10; 
 463      print "$a\n";
 464  
 465      sub sub1 { $foo = $a; $a = $foo + 1; }
 466      sub sub2 { $bar = $a; $a = $bar + 1; }
 467  
 468  What do you think $a will be? The answer, unfortunately, is "it
 469  depends." Both sub1() and sub2() access the global variable $a, once
 470  to read and once to write.  Depending on factors ranging from your
 471  thread implementation's scheduling algorithm to the phase of the moon,
 472  $a can be 2 or 3.
 473  
 474  Race conditions are caused by unsynchronized access to shared
 475  data.  Without explicit synchronization, there's no way to be sure that
 476  nothing has happened to the shared data between the time you access it
 477  and the time you update it.  Even this simple code fragment has the
 478  possibility of error:
 479  
 480      use Thread qw(async); 
 481      $a = 2; 
 482      async{ $b = $a; $a = $b + 1; }; 
 483      async{ $c = $a; $a = $c + 1; };
 484  
 485  Two threads both access $a.  Each thread can potentially be interrupted
 486  at any point, or be executed in any order.  At the end, $a could be 3
 487  or 4, and both $b and $c could be 2 or 3.
 488  
 489  Whenever your program accesses data or resources that can be accessed
 490  by other threads, you must take steps to coordinate access or risk
 491  data corruption and race conditions.
 492  
 493  =head2 Controlling access: lock()
 494  
 495  The lock() function takes a variable (or subroutine, but we'll get to
 496  that later) and puts a lock on it.  No other thread may lock the
 497  variable until the locking thread exits the innermost block containing
 498  the lock.  Using lock() is straightforward:
 499  
 500      use Thread qw(async); 
 501      $a = 4; 
 502      $thr1 = async { 
 503          $foo = 12; 
 504          { 
 505              lock ($a); # Block until we get access to $a 
 506              $b = $a; 
 507              $a = $b * $foo; 
 508          } 
 509          print "\$foo was $foo\n";
 510      }; 
 511      $thr2 = async { 
 512          $bar = 7; 
 513          { 
 514              lock ($a); # Block until we can get access to $a
 515              $c = $a; 
 516              $a = $c * $bar; 
 517          } 
 518          print "\$bar was $bar\n";
 519      }; 
 520      $thr1->join; 
 521      $thr2->join; 
 522      print "\$a is $a\n";
 523  
 524  lock() blocks the thread until the variable being locked is
 525  available.  When lock() returns, your thread can be sure that no other
 526  thread can lock that variable until the innermost block containing the
 527  lock exits.
 528  
 529  It's important to note that locks don't prevent access to the variable
 530  in question, only lock attempts.  This is in keeping with Perl's
 531  longstanding tradition of courteous programming, and the advisory file
 532  locking that flock() gives you.  Locked subroutines behave differently,
 533  however.  We'll cover that later in the article.
 534  
 535  You may lock arrays and hashes as well as scalars.  Locking an array,
 536  though, will not block subsequent locks on array elements, just lock
 537  attempts on the array itself.
 538  
 539  Finally, locks are recursive, which means it's okay for a thread to
 540  lock a variable more than once.  The lock will last until the outermost
 541  lock() on the variable goes out of scope.
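
For readers on a perl with the newer ithreads model, here is a minimal
sketch of the same locking idea.  It is not the old Thread interface:
under ithreads a variable must be marked C<:shared> (from
threads::shared) before threads can see or lock it.  The names bump()
and run_counter() are invented for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;    # visible to every thread

# Increment the shared counter $times times, one lock per increment.
sub bump {
    my ($times) = @_;
    for (1 .. $times) {
        lock($counter);     # released at the end of this loop body
        $counter++;
    }
}

# Start $nthreads workers, wait for all of them, return the total.
sub run_counter {
    my ($nthreads, $times) = @_;
    { lock($counter); $counter = 0; }
    my @thr = map { threads->create(\&bump, $times) } 1 .. $nthreads;
    $_->join for @thr;
    return $counter;
}

print "Final count: ", run_counter(4, 1000), "\n";   # prints "Final count: 4000"
```

Because every read-modify-write of $counter happens while the lock is
held, no increment is lost, whatever order the threads run in.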
 542  
 543  =head2 Thread Pitfall: Deadlocks
 544  
 545  Locks are a handy tool to synchronize access to data.  Using them
 546  properly is the key to safe shared data.  Unfortunately, locks aren't
 547  without their dangers.  Consider the following code:
 548  
 549      use Thread qw(async yield); 
 550      $a = 4; 
 551      $b = "foo"; 
 552      async { 
 553          lock($a); 
 554          yield; 
 555          sleep 20; 
 556          lock ($b); 
 557      }; 
 558      async { 
 559          lock($b); 
 560          yield; 
 561          sleep 20; 
 562          lock ($a); 
 563      };
 564  
 565  This program will probably hang until you kill it.  The only way it
 566  won't hang is if one of the two async() routines acquires both locks
 567  first.  A guaranteed-to-hang version is more complicated, but the
 568  principle is the same.
 569  
 570  The first thread spawned by async() will grab a lock on $a then, a
 571  second or two later, try to grab a lock on $b.  Meanwhile, the second
 572  thread grabs a lock on $b, then later tries to grab a lock on $a.  The
 573  second lock attempt for both threads will block, each waiting for the
 574  other to release its lock.
 575  
 576  This condition is called a deadlock, and it occurs whenever two or
 577  more threads are trying to get locks on resources that the others
 578  own.  Each thread will block, waiting for the other to release a lock
 579  on a resource.  That never happens, though, since the thread with the
 580  resource is itself waiting for a lock to be released.
 581  
 582  There are a number of ways to handle this sort of problem.  The best
 583  way is to always have all threads acquire locks in the exact same
 584  order.  If, for example, you lock variables $a, $b, and $c, always lock
 585  $a before $b, and $b before $c.  It's also best to hold on to locks for
 586  as short a period of time as possible to minimize the risks of deadlock.
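
The lock-ordering rule can be sketched as follows, again using the
newer threads and threads::shared modules on the assumption of an
ithreads-enabled perl.  The two balances and the transfer() and
run_transfers() subroutines are hypothetical.  Both threads need both
locks, but each takes them in the same order, so neither can end up
holding one lock while waiting for the other.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Two shared balances (made-up data for this sketch).
my $first  :shared = 100;
my $second :shared = 100;

# Move $amount in either direction.  Whatever the direction, the locks
# are always taken in the same order: $first, then $second.
sub transfer {
    my ($from_first, $amount) = @_;
    lock($first);
    lock($second);
    if ($from_first) { $first  -= $amount; $second += $amount }
    else             { $second -= $amount; $first  += $amount }
}

# Run two threads that transfer in opposite directions.
sub run_transfers {
    my ($rounds) = @_;
    my @thr = (
        threads->create(sub { transfer(1, 1) for 1 .. $rounds }),
        threads->create(sub { transfer(0, 1) for 1 .. $rounds }),
    );
    $_->join for @thr;
    return ($first, $second);
}

my ($x, $y) = run_transfers(500);
print "Balances after transfers: $x and $y\n";   # prints "... 100 and 100"
```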
 587  
 588  =head2 Queues: Passing Data Around
 589  
 590  A queue is a special thread-safe object that lets you put data in one
 591  end and take it out the other without having to worry about
 592  synchronization issues.  Queues are pretty straightforward, and look like
 593  this:
 594  
 595      use Thread qw(async); 
 596      use Thread::Queue;
 597  
 598      my $DataQueue = Thread::Queue->new();
 599      $thr = async { 
 600          while (defined($DataElement = $DataQueue->dequeue)) { 
 601              print "Popped $DataElement off the queue\n";
 602          } 
 603      }; 
 604  
 605      $DataQueue->enqueue(12); 
 606      $DataQueue->enqueue("A", "B", "C"); 
 607      sleep 10; 
 608      $DataQueue->enqueue(undef);
 609  
 610  You create the queue with Thread::Queue->new().  Then you can add lists of
 611  scalars onto the end with enqueue(), and pop scalars off the front of
 612  it with dequeue().  A queue has no fixed size, and can grow as needed
 613  to hold everything pushed on to it.
 614  
 615  If a queue is empty, dequeue() blocks until another thread enqueues
 616  something.  This makes queues ideal for event loops and other
 617  communications between threads.
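
A small sketch of the queue as an event channel, written for the newer
threads module and assuming an ithreads-enabled perl.  consume_all() is
a made-up name.  The consumer tests each item with defined(), so a
false value such as 0 passes through and only the undef sentinel ends
the loop.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Feed @items to a consumer thread and return what it saw, in order.
# consume_all() is a hypothetical helper for this sketch.
sub consume_all {
    my @items = @_;
    my $queue = Thread::Queue->new();

    # List context so the thread can return a list through join().
    my ($thr) = threads->create(sub {
        my @seen;
        # dequeue() blocks while the queue is empty; undef means "stop".
        while (defined(my $item = $queue->dequeue)) {
            push @seen, $item;
        }
        return @seen;
    });

    $queue->enqueue(@items);
    $queue->enqueue(undef);     # tell the consumer there is no more work
    return $thr->join;
}

my @got = consume_all(12, "A", 0, "B");
print "Consumer saw: @got\n";   # prints "Consumer saw: 12 A 0 B"
```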
 618  
 619  =head1 Threads And Code
 620  
 621  In addition to providing thread-safe access to data via locks and
 622  queues, threaded Perl also provides general-purpose semaphores for
 623  coarser synchronization than locks provide and thread-safe access to
 624  entire subroutines.
 625  
 626  =head2 Semaphores: Synchronizing Data Access
 627  
 628  Semaphores are a kind of generic locking mechanism.  Unlike lock, which
 629  gets a lock on a particular scalar, Perl doesn't associate any
 630  particular thing with a semaphore so you can use them to control
 631  access to anything you like.  In addition, semaphores can allow more
 632  than one thread to access a resource at once, though by default
 633  semaphores only allow one thread access at a time.
 634  
 635  =over 4
 636  
 637  =item Basic semaphores
 638  
 639  Semaphores have two methods, down and up. down decrements the resource
 640  count, while up increments it.  down calls will block if the
 641  semaphore's current count would decrement below zero.  This program
 642  gives a quick demonstration:
 643  
 644      use Thread qw(yield); 
 645      use Thread::Semaphore; 
 646      my $semaphore = Thread::Semaphore->new();
 647      $GlobalVariable = 0;
 648  
 649      $thr1 = Thread->new( \&sample_sub, 1 );
 650      $thr2 = Thread->new( \&sample_sub, 2 );
 651      $thr3 = Thread->new( \&sample_sub, 3 );
 652  
 653      sub sample_sub { 
 654          my $SubNumber = shift @_; 
 655          my $TryCount = 10; 
 656          my $LocalCopy; 
 657          sleep 1; 
 658          while ($TryCount--) { 
 659              $semaphore->down; 
 660              $LocalCopy = $GlobalVariable; 
 661              print "$TryCount tries left for sub $SubNumber (\$GlobalVariable is $GlobalVariable)\n"; 
 662              yield; 
 663              sleep 2; 
 664              $LocalCopy++; 
 665              $GlobalVariable = $LocalCopy; 
 666              $semaphore->up; 
 667          } 
 668      }
 669  
 670  The three invocations of the subroutine all operate in sync.  The
 671  semaphore, though, makes sure that only one thread is accessing the
 672  global variable at once.
 673  
 674  =item Advanced Semaphores
 675  
 676  By default, semaphores behave like locks, letting only one thread
 677  down() them at a time.  However, there are other uses for semaphores.
 678  
 679  Each semaphore has a counter attached to it. down() decrements the
 680  counter and up() increments the counter.  By default, semaphores are
 681  created with the counter set to one, down() decrements by one, and
 682  up() increments by one.  If down() attempts to decrement the counter
 683  below zero, it blocks until the counter is large enough.  Note that
 684  while a semaphore can be created with a starting count of zero, any
 685  up() or down() always changes the counter by at least
 686  one. $semaphore->down(0) is the same as $semaphore->down(1).
 687  
 688  The question, of course, is why would you do something like this? Why
 689  create a semaphore with a starting count that's not one, or why
 690  decrement/increment it by more than one? The answer is resource
 691  availability.  Many resources that you want to manage access for can be
 692  safely used by more than one thread at once.
 693  
 694  For example, let's take a GUI driven program.  It has a semaphore that
 695  it uses to synchronize access to the display, so only one thread is
 696  ever drawing at once.  Handy, but of course you don't want any thread
 697  to start drawing until things are properly set up.  In this case, you
 698  can create a semaphore with a counter set to zero, and up it when
 699  things are ready for drawing.
 700  
 701  Semaphores with counters greater than one are also useful for
 702  establishing quotas.  Say, for example, that you have a number of
 703  threads that can do I/O at once.  You don't want all the threads
 704  reading or writing at once though, since that can potentially swamp
 705  your I/O channels, or deplete your process' quota of filehandles.  You
 706  can use a semaphore initialized to the number of concurrent I/O
 707  requests (or open files) that you want at any one time, and have your
 708  threads quietly block and unblock themselves.
 709  
 710  Larger increments or decrements are handy in those cases where a
 711  thread needs to check out or return a number of resources at once.
 712  
 713  =back
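
The quota idea above can be sketched as follows.  This is only an
illustration under stated assumptions: it uses the newer C<threads> and
C<threads::shared> modules rather than the 5.005 C<Thread> package (which
no longer exists in current perls), and the names C<$MAX_CONCURRENT>,
C<$io_slots>, C<$active> and C<$peak> are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $MAX_CONCURRENT = 3;                  # hypothetical quota of I/O slots
my $io_slots = Thread::Semaphore->new($MAX_CONCURRENT);

my $active :shared = 0;                  # workers currently holding a slot
my $peak   :shared = 0;                  # highest value $active ever reached

sub worker {
    $io_slots->down();                   # blocks while all slots are taken
    {
        lock($active);                   # protects both $active and $peak
        $active++;
        $peak = $active if $active > $peak;
    }
    select(undef, undef, undef, 0.05);   # stand-in for real I/O
    {
        lock($active);
        $active--;
    }
    $io_slots->up();                     # hand the slot back
}

my @workers = map { threads->create(\&worker) } 1 .. 8;
$_->join() for @workers;

print "peak concurrency: $peak (quota $MAX_CONCURRENT)\n";
```

Eight workers compete for three slots, so the reported peak should never
exceed the quota, whatever order the threads happen to run in.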
 714  
 715  =head2 Attributes: Restricting Access To Subroutines
 716  
 717  In addition to synchronizing access to data or resources, you might
 718  find it useful to synchronize access to subroutines.  You may be
 719  accessing a singular machine resource (perhaps a vector processor), or
 720  find it easier to serialize calls to a particular subroutine than to
 721  have a set of locks and semaphores.
 722  
 723  One of the additions to Perl 5.005 is subroutine attributes.  The
 724  Thread package uses these to provide several flavors of
 725  serialization.  It's important to remember that these attributes are
 726  used in the compilation phase of your program so you can't change a
 727  subroutine's behavior while your program is actually running.
 728  
 729  =head2 Subroutine Locks
 730  
 731  The basic subroutine lock looks like this:
 732  
 733      sub test_sub :locked { 
 734      }
 735  
 736  This ensures that only one thread will be executing this subroutine at
 737  any one time.  Once a thread calls this subroutine, any other thread
 738  that calls it will block until the thread in the subroutine exits
 739  it.  A more elaborate example looks like this:
 740  
 741      use Thread qw(yield); 
 742  
 743      new Thread \&thread_sub, 1; 
 744      new Thread \&thread_sub, 2; 
 745      new Thread \&thread_sub, 3; 
 746      new Thread \&thread_sub, 4;
 747  
 748      sub sync_sub :locked { 
 749          my $CallingThread = shift @_; 
 750          print "In sync_sub for thread $CallingThread\n";
 751          yield; 
 752          sleep 3; 
 753          print "Leaving sync_sub for thread $CallingThread\n"; 
 754      }
 755  
 756      sub thread_sub { 
 757          my $ThreadID = shift @_; 
 758          print "Thread $ThreadID calling sync_sub\n";
 759          sync_sub($ThreadID); 
 760          print "$ThreadID is done with sync_sub\n"; 
 761      }
 762  
 763  The C<locked> attribute tells perl to lock sync_sub(), and if you run
 764  this, you can see that only one thread is in it at any one time.
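
For comparison, here is a rough sketch of how the same one-at-a-time
behaviour might be obtained under the newer ithreads model, which has no
working equivalent of the C<locked> attribute.  It is an assumption-laden
illustration, not a drop-in replacement: the guard variable
C<$sync_guard> and the C<@events> log are invented for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $sync_guard :shared;          # hypothetical guard standing in for :locked
my @events     :shared;          # enter/leave log, only touched under the lock

sub sync_sub {
    my $calling_thread = shift;
    lock($sync_guard);           # released automatically when the sub returns
    push @events, "enter $calling_thread";
    threads->yield();            # give other threads a chance to barge in
    push @events, "leave $calling_thread";
}

my @thr = map { threads->create(\&sync_sub, $_) } 1 .. 4;
$_->join() for @thr;

print "$_\n" for @events;
```

Because the lock is held for the whole body of sync_sub(), every "enter"
line in the log is followed directly by the matching "leave" line.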
 765  
 766  =head2 Methods
 767  
 768  Locking an entire subroutine can sometimes be overkill, especially
 769  when dealing with Perl objects.  When calling a method for an object,
 770  for example, you want to serialize calls to a method, so that only one
 771  thread will be in the subroutine for a particular object, but threads
 772  calling that subroutine for a different object aren't blocked.  The
 773  C<method> attribute marks the subroutine as a method, so that C<locked> locks the object it is called on rather than the subroutine.
 774  
 775      use Thread qw(yield);
 776  
 777      sub tester { 
 778          my $thrnum = shift @_; 
 779          my $bar = Foo->new();
 780          foreach (1..10) {     
 781              print "$thrnum calling per_object\n"; 
 782              $bar->per_object($thrnum);     
 783              print "$thrnum out of per_object\n"; 
 784              yield; 
 785              print "$thrnum calling one_at_a_time\n";
 786              $bar->one_at_a_time($thrnum);     
 787              print "$thrnum out of one_at_a_time\n"; 
 788              yield; 
 789          } 
 790      }
 791  
 792      foreach my $thrnum (1..10) { 
 793          new Thread \&tester, $thrnum; 
 794      }
 795  
 796      package Foo; use Thread qw(yield);
 797      sub new { 
 798          my $class = shift @_; 
 799          return bless [@_], $class; 
 800      }
 801  
 802      sub per_object :locked :method { 
 803          my ($class, $thrnum) = @_; 
 804          print "In per_object for thread $thrnum\n"; 
 805          yield; 
 806          sleep 2; 
 807          print "Exiting per_object for thread $thrnum\n"; 
 808      }
 809  
 810      sub one_at_a_time :locked { 
 811          my ($class, $thrnum) = @_; 
 812          print "In one_at_a_time for thread $thrnum\n";     
 813          yield; 
 814          sleep 2; 
 815          print "Exiting one_at_a_time for thread $thrnum\n"; 
 816      }
 817  
 818  As you can see from the output (omitted for brevity; it's 800 lines),
 819  all the threads can be in per_object() simultaneously, because each thread has its own Foo object, but only one
 820  thread is ever in one_at_a_time() at once.
 821  
 822  =head2 Locking A Subroutine
 823  
 824  You can lock a subroutine as you would lock a variable.  Subroutine locks
 825  work the same as specifying a C<locked> attribute for the subroutine,
 826  and block all access to the subroutine for other threads until the
 827  lock goes out of scope.  When the subroutine isn't locked, any number
 828  of threads can be in it at once, and getting a lock on a subroutine
 829  doesn't affect threads already in the subroutine.  Getting a lock on a
 830  subroutine looks like this:
 831  
 832      lock(\&sub_to_lock);
 833  
 834  Simple enough.  Unlike the C<locked> attribute, which is a compile time
 835  option, locking and unlocking a subroutine can be done at runtime at your
 836  discretion.  There is some runtime penalty to using lock(\&sub) instead
 837  of the C<locked> attribute, so make sure you're choosing the proper
 838  method to do the locking.
 839  
 840  You'd choose lock(\&sub) when writing modules and code to run on both
 841  threaded and unthreaded Perl, especially for code that will run on
 842  5.004 or earlier Perls.  In that case, it's useful to have subroutines
 843  that should be serialized lock themselves if they're running threaded,
 844  like so:
 845  
 846      package Foo; 
 847      use Config; 
 848      use vars qw($Running_Threaded);  # declare only: a runtime "= 0" would undo the BEGIN block
 849  
 850      BEGIN { $Running_Threaded = $Config{'usethreads'} }
 851  
 852      sub sub1 { lock(\&sub1) if $Running_Threaded }
 853  
 854  
 855  This way you can ensure single-threadedness regardless of which
 856  version of Perl you're running.
 857  
 858  =head1 General Thread Utility Routines
 859  
 860  We've covered the workhorse parts of Perl's threading package, and
 861  with these tools you should be well on your way to writing threaded
 862  code and packages.  There are a few useful little pieces that didn't
 863  really fit in anyplace else.
 864  
 865  =head2 What Thread Am I In?
 866  
 867  The Thread->self method provides your program with a way to get an
 868  object representing the thread it's currently in.  You can use this
 869  object in the same way as the ones returned from the thread creation.
 870  
 871  =head2 Thread IDs
 872  
 873  tid() is a thread object method that returns the thread ID of the
 874  thread the object represents.  Thread IDs are integers, with the main
 875  thread in a program being 0.  Currently Perl assigns a unique tid to
 876  every thread ever created in your program, assigning the first thread
 877  to be created a tid of 1, and increasing the tid by 1 for each new
 878  thread that's created.
 879  
 880  =head2 Are These Threads The Same?
 881  
 882  The equal() method takes two thread objects and returns true 
 883  if the objects represent the same thread, and false if they don't.
 884  
 885  =head2 What Threads Are Running?
 886  
 887  Thread->list returns a list of thread objects, one for each thread
 888  that's currently running.  Handy for a number of things, including
 889  cleaning up at the end of your program:
 890  
 891      # Loop through all the threads 
 892      foreach $thr (Thread->list) { 
 893          # Don't join the main thread or ourselves 
 894          if ($thr->tid && !Thread::equal($thr, Thread->self)) { 
 895              $thr->join; 
 896          } 
 897      }
 898  
 899  The example above is just for illustration.  It isn't strictly
 900  necessary to join all the threads you create, since Perl detaches all
 901  the threads before it exits.
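
The utility routines in this section have close counterparts in the newer
C<threads> module.  The following is a minimal sketch under that
assumption (it does not use the 5.005 C<Thread> package), and the variable
names are made up for the example.

```perl
use strict;
use warnings;
use threads;

# Each child simply reports its own thread ID as its return value.
my @kids = map { threads->create(sub { return threads->tid() }) } 1 .. 3;

my $main_tid = threads->tid();              # the main thread has tid 0
my @child_tids;

# threads->list() returns the threads that are neither joined nor detached.
foreach my $thr (threads->list()) {
    next if $thr->equal(threads->self());   # never try to join ourselves
    push @child_tids, $thr->join();
}

print "main tid: $main_tid, child tids: @child_tids\n";
```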
 902  
 903  =head1 A Complete Example
 904  
 905  Confused yet? It's time for an example program to show some of the
 906  things we've covered.  This program finds prime numbers using threads.
 907  
 908      1  #!/usr/bin/perl -w
 909      2  # prime-pthread, courtesy of Tom Christiansen
 910      3
 911      4  use strict;
 912      5
 913      6  use Thread;
 914      7  use Thread::Queue;
 915      8
 916      9  my $stream = Thread::Queue->new();
 917      10 my $kid    = Thread->new(\&check_num, $stream, 2);
 918      11
 919      12 for my $i ( 3 .. 1000 ) {
 920      13     $stream->enqueue($i);
 921      14 } 
 922      15
 923      16 $stream->enqueue(undef);
 924      17 $kid->join();
 925      18
 926      19 sub check_num {
 927      20     my ($upstream, $cur_prime) = @_;
 928      21     my $kid;
 929      22     my $downstream = Thread::Queue->new();
 930      23     while (my $num = $upstream->dequeue) {
 931      24         next unless $num % $cur_prime;
 932      25         if ($kid) {
 933      26             $downstream->enqueue($num);
 934      27         } else {
 935      28             print "Found prime $num\n";
 936      29             $kid = Thread->new(\&check_num, $downstream, $num);
 937      30         }
 938      31     } 
 939      32     $downstream->enqueue(undef) if $kid;
 940      33     $kid->join()        if $kid;
 941      34 }
 942  
 943  This program uses the pipeline model to generate prime numbers.  Each
 944  thread in the pipeline has an input queue that feeds numbers to be
 945  checked, a prime number that it's responsible for, and an output queue
 946  that it funnels numbers that have failed the check into.  If the thread
 947  has a number that's failed its check and there's no child thread, then
 948  the thread must have found a new prime number.  In that case, a new
 949  child thread is created for that prime and stuck on the end of the
 950  pipeline.
 951  
 952  This probably sounds a bit more confusing than it really is, so let's
 953  go through this program piece by piece and see what it does.  (For
 954  those of you who might be trying to remember exactly what a prime
 955  number is, it's a number that's only evenly divisible by itself and 1.)
 956  
 957  The bulk of the work is done by the check_num() subroutine, which
 958  takes a reference to its input queue and a prime number that it's
 959  responsible for.  After pulling in the input queue and the prime that
 960  the subroutine's checking (line 20), we create a new queue (line 22)
 961  and reserve a scalar for the thread that we're likely to create later
 962  (line 21).
 963  
 964  The while loop from line 23 to line 31 grabs a scalar off the input
 965  queue and checks it against the prime this thread is responsible
 966  for.  Line 24 checks to see if there's a remainder when we modulo the
 967  number to be checked against our prime.  If there is one, the number
 968  must not be evenly divisible by our prime, so we need to either pass
 969  it on to the next thread if we've created one (line 26) or create a
 970  new thread if we haven't.
 971  
 972  The new thread creation is line 29.  We pass on to it a reference to
 973  the queue we've created, and the prime number we've found.
 974  
 975  Finally, once the loop terminates (because we got a 0 or undef in the
 976  queue, which serves as a note to die), we pass on the notice to our
 977  child and wait for it to exit if we've created a child (lines 32 and
 978  33).
 979  
 980  Meanwhile, back in the main thread, we create a queue (line 9) and the
 981  initial child thread (line 10), and pre-seed it with the first prime:
 982  2.  Then we queue all the numbers from 3 to 1000 for checking (lines
 983  12-14), then queue a die notice (line 16) and wait for the first child
 984  thread to terminate (line 17).  Because a child won't die until its
 985  child has died, we know that we're done once we return from the join.
 986  
 987  That's how it works.  It's pretty simple; as with many Perl programs,
 988  the explanation is much longer than the program.
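
As a point of comparison, the same pipeline can be sketched with the newer
C<threads> module.  This is an illustrative port under stated assumptions,
not the original program: it collects the primes in a shared array
C<@primes> instead of printing from each thread, it stops on C<undef> only
(so a 0 in the queue would not end the loop), and it uses a small,
made-up bound C<$LIMIT> because every ithread carries its own copy of the
interpreter.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my @primes :shared = (2);        # results, seeded with the first prime

sub check_num {
    my ($upstream, $cur_prime) = @_;
    my $kid;
    my $downstream = Thread::Queue->new();
    while (defined(my $num = $upstream->dequeue())) {
        next unless $num % $cur_prime;       # divisible, so not prime
        if ($kid) {
            $downstream->enqueue($num);      # let the later stages test it
        } else {
            { lock(@primes); push @primes, $num; }
            $kid = threads->create(\&check_num, $downstream, $num);
        }
    }
    if ($kid) {
        $downstream->enqueue(undef);         # pass the stop marker along
        $kid->join();
    }
}

my $LIMIT  = 50;                 # hypothetical upper bound, kept small
my $stream = Thread::Queue->new();
my $kid    = threads->create(\&check_num, $stream, 2);

$stream->enqueue($_) for 3 .. $LIMIT;
$stream->enqueue(undef);         # tell the pipeline to shut down
$kid->join();

print "@primes\n";
```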
 989  
 990  =head1 Conclusion
 991  
 992  A complete thread tutorial could fill a book (and has, many times),
 993  but this should be enough to get you started.  The final authority on how
 994  Perl's threads behave is the documentation bundled with the Perl
 995  distribution, but with what we've covered in this article, you should
 996  be well on your way to becoming a threaded Perl expert.
 997  
 998  =head1 Bibliography
 999  
1000  Here's a short bibliography courtesy of Jürgen Christoffel:
1001  
1002  =head2 Introductory Texts
1003  
1004  Birrell, Andrew D. An Introduction to Programming with
1005  Threads. Digital Equipment Corporation, 1989, DEC-SRC Research Report
1006  #35 online as
1007  http://www.research.digital.com/SRC/staff/birrell/bib.html (highly
1008  recommended)
1009  
1010  Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A
1011  Guide to Concurrency, Communication, and
1012  Multithreading. Prentice-Hall, 1996.
1013  
1014  Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with
1015  Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written
1016  introduction to threads).
1017  
1018  Nelson, Greg (editor). Systems Programming with Modula-3.  Prentice
1019  Hall, 1991, ISBN 0-13-590464-1.
1020  
1021  Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell.
1022  Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1
1023  (covers POSIX threads).
1024  
1025  =head2 OS-Related References
1026  
1027  Boykin, Joseph, David Kirschen, Alan Langerman, and Susan
1028  LoVerso. Programming under Mach. Addison-Wesley, 1994, ISBN
1029  0-201-52739-1.
1030  
1031  Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall,
1032  1995, ISBN 0-13-219908-4 (great textbook).
1033  
1034  Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts,
1035  4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.
1036  
1037  =head2 Other References
1038  
1039  Arnold, Ken and James Gosling. The Java Programming Language, 2nd
1040  ed. Addison-Wesley, 1998, ISBN 0-201-31006-6.
1041  
1042  Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage
1043  Collection on Virtually Shared Memory Architectures" in Memory
1044  Management: Proc. of the International Workshop IWMM 92, St. Malo,
1045  France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer,
1046  1992, ISBN 3-540-55940-X (real-life thread applications).
1047  
1048  =head1 Acknowledgements
1049  
1050  Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
1051  Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua
1052  Pritikin, and Alan Burlison, for their help in reality-checking and
1053  polishing this article.  Big thanks to Tom Christiansen for his rewrite
1054  of the prime number generator.
1055  
1056  =head1 AUTHOR
1057  
1058  Dan Sugalski E<lt>sugalskd@ous.eduE<gt>
1059  
1060  =head1 Copyrights
1061  
1062  This article originally appeared in The Perl Journal #10, and is
1063  copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and
1064  The Perl Journal.  This document may be distributed under the same terms
1065  as Perl itself.
1066  
1067  

