/se3-unattended/var/se3/unattended/install/linuxaux/opt/perl/lib/5.10.0/pod/ -> perltodo.pod (source)

   1  =head1 NAME
   2  
   3  perltodo - Perl TO-DO List
   4  
   5  =head1 DESCRIPTION
   6  
   7  This is a list of wishes for Perl. The tasks we think are smaller or easier
   8  are listed first. Anyone is welcome to work on any of these, but it's a good
   9  idea to first contact I<perl5-porters@perl.org> to avoid duplication of
  10  effort. By all means contact a pumpking privately first if you prefer.
  11  
  12  Whilst patches to make the list shorter are most welcome, ideas to add to
  13  the list are also encouraged. Check the perl5-porters archives for past
  14  ideas, and any discussion about them. One set of archives may be found at:
  15  
  16      http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/
  17  
  18  What can we offer you in return? Fame, fortune, and everlasting glory? Maybe
  19  not, but if your patch is incorporated, then we'll add your name to the
  20  F<AUTHORS> file, which ships in the official distribution. How many other
  21  programming languages offer you 1 line of immortality?
  22  
  23  =head1 Tasks that only need Perl knowledge
  24  
  25  =head2 Remove duplication of test setup.
  26  
  27  Schwern notes that there's duplication of code - lots and lots of tests have
  28  some variation on the big block of C<$Is_Foo> checks.  We can safely put this
  29  into a file, change it to build an C<%Is> hash and require it.  Maybe just put
  30  it into F<test.pl>. Throw in the handy tainting subroutines.
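
A minimal sketch of the C<%Is> idea, not the layout F<test.pl> actually uses. The helper name C<build_is_hash> and the list of platforms are assumptions made for illustration. It derives every flag from a single OS name (normally C<$^O>), so a test can ask C<$Is{VMS}> instead of declaring its own C<$Is_VMS>.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an OS name (normally $^O) into a hash of
# boolean flags, one per platform that the tests care about.
sub build_is_hash {
    my ($osname) = @_;
    my @known = qw(MSWin32 VMS cygwin darwin linux os2);
    return map { ($_ => ($osname eq $_ ? 1 : 0)) } @known;
}

our %Is = build_is_hash($^O);

# A test can now consult one shared hash instead of its own $Is_Foo block.
print "Running on VMS\n"   if $Is{VMS};
print "Running on Win32\n" if $Is{MSWin32};
print "OS name is $^O\n";
```

Taking the OS name as a parameter, rather than reading C<$^O> inside the helper, keeps the helper itself easy to test.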
  31  
  32  =head2 merge common code in installperl and installman
  33  
  34  There are some common subroutines and a common C<BEGIN> block in F<installperl>
  35  and F<installman>. These should probably be merged. It would also be good to
  36  check for duplication in all the utility scripts supplied in the source
  37  tarball. It might be good to move them all to a subdirectory, but this would
  38  require careful checking to find all places that call them, and change those
  39  correctly.
  40  
  41  =head2 common test code for timed bail out
  42  
  43  Write portable self destruct code for tests to stop them burning CPU in
  44  infinite loops. This needs to avoid using alarm, as some of the tests are
  45  testing alarm/sleep or timers.
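
One possible shape for such a bail-out, sketched under the assumption that the platform has a real C<fork()>, so it is not the portable answer this item asks for. The names C<start_watchdog> and C<stop_watchdog> are made up for this example. The timer lives in a separate process, so the test's own C<alarm> and C<sleep> calls are left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: start a watchdog process that kills this test
# if it is still running after $timeout seconds. Returns the watchdog PID.
sub start_watchdog {
    my ($timeout) = @_;
    my $parent = $$;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: the sleep happens here, not in the process under test.
        sleep $timeout;
        kill 'KILL', $parent;
        exit 0;
    }
    return $pid;
}

# Hypothetical helper: cancel the watchdog once the test finishes normally.
sub stop_watchdog {
    my ($pid) = @_;
    kill 'KILL', $pid;
    waitpid $pid, 0;
    return;
}

my $watchdog = start_watchdog(60);
# ... the body of the test would run here ...
stop_watchdog($watchdog);
```

A real implementation would also need to cope with the parent exiting early without cancelling the watchdog, and with platforms where C<fork> is emulated or missing.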
  46  
  47  =head2 POD -E<gt> HTML conversion in the core still sucks
  48  
  49  Which is crazy given just how simple POD purports to be, and how simple HTML
  50  can be. It's not actually I<as> simple as it sounds, particularly with the
  51  flexibility POD allows for C<=item>, but it would be good to improve the
  52  visual appeal of the HTML generated, and to avoid it having any validation
  53  errors. See also L</make HTML install work>, as the layout of the installation
  54  tree is needed to improve the cross-linking.
  55  
  56  The addition of C<Pod::Simple> and its related modules may make this task
  57  easier to complete.
  58  
  59  =head2 merge checkpods and podchecker
  60  
  61  F<pod/checkpods.PL> (and C<make check> in the F<pod/> subdirectory)
  62  implements a very basic check for pod files, but the errors it discovers
  63  aren't found by podchecker. Add this check to podchecker, get rid of
  64  checkpods and have C<make check> use podchecker.
  65  
  66  =head2 perlmodlib.PL rewrite
  67  
  68  Currently perlmodlib.PL needs to be run from a source directory where perl
  69  has been built, or some modules won't be found, and others will be
  70  skipped. Make it run from a clean perl source tree (so it's reproducible).
  71  
  72  =head2 Parallel testing
  73  
  74  (This probably impacts much more than the core: also the Test::Harness
  75  and TAP::* modules on CPAN.)
  76  
  77  The core regression test suite is getting ever more comprehensive, which has
  78  the side effect that it takes longer to run. This isn't so good. Investigate
  79  whether it would be feasible to give the harness script the B<option> of
  80  running sets of tests in parallel. This would be useful for tests in
  81  F<t/op/*.t> and F<t/uni/*.t> and maybe some sets of tests in F<lib/>.
  82  
  83  Questions to answer
  84  
  85  =over 4
  86  
  87  =item 1
  88  
  89  How does screen layout work when you're running more than one test?
  90  
  91  =item 2
  92  
  93  How does the caller of test specify how many tests to run in parallel?
  94  
  95  =item 3
  96  
  97  How do setup/teardown tests identify themselves?
  98  
  99  =back
 100  
 101  Pugs already does parallel testing - can their approach be re-used?
 102  
 103  =head2 Make Schwern poorer
 104  
 105  We should have tests for everything. When all the core's modules are tested,
 106  Schwern has promised to donate $500 to TPF. We may need volunteers to
 107  hold him upside down and shake vigorously in order to actually extract the
 108  cash.
 109  
 110  =head2 Improve the coverage of the core tests
 111  
 112  Use Devel::Cover to ascertain the core modules' test coverage, then add
 113  tests that are currently missing.
 114  
 115  =head2 test B
 116  
 117  A full test suite for the B module would be nice.
 118  
 119  =head2 Deparse inlined constants
 120  
 121  Code such as this
 122  
 123      use constant PI => 4;
 124      warn PI
 125  
 126  will currently deparse as
 127  
 128      use constant ('PI', 4);
 129      warn 4;
 130  
 131  because the tokenizer inlines the value of the constant subroutine C<PI>.
 132  This allows various compile time optimisations, such as constant folding
 133  and dead code elimination. Where these haven't happened (such as the example
 134  above) it ought to be possible to make B::Deparse work out the name of the
 135  original constant, because just enough information survives in the symbol
 136  table to do this. Specifically, the same scalar is used for the constant in
 137  the optree as is used for the constant subroutine, so by iterating over all
 138  symbol tables and generating a mapping of SV address to constant name, it
 139  would be possible to provide B::Deparse with this functionality.
 140  
 141  =head2 A decent benchmark
 142  
 143  C<perlbench> seems impervious to any recent changes made to the perl core. It
 144  would be useful to have a reasonable general benchmarking suite that roughly
 145  represented what current perl programs do, and measurably reported whether
 146  tweaks to the core improve, degrade or don't really affect performance, to
 147  guide people attempting to optimise the guts of perl. Gisle would welcome
 148  new tests for perlbench.
 149  
 150  =head2 fix tainting bugs
 151  
 152  Fix the bugs revealed by running the test suite with the C<-t> switch (via
 153  C<make test.taintwarn>).
 154  
 155  =head2 Dual life everything
 156  
 157  As part of the "dists" plan, anything that doesn't belong in the smallest perl
 158  distribution needs to be dual lifed. Anything else can be too. Figure out what
 159  changes would be needed to package that module and its tests up for CPAN, and
 160  do so. Test it with older perl releases, and fix the problems you find.
 161  
 162  To make a minimal perl distribution, it's useful to look at
 163  F<t/lib/commonsense.t>.
 164  
 165  =head2 Improving C<threads::shared>
 166  
 167  Investigate whether C<threads::shared> could share aggregates properly with
 168  only Perl level changes to F<shared.pm>.
 169  
 170  =head2 POSIX memory footprint
 171  
 172  Ilya observed that C<use POSIX;> eats memory like there's no tomorrow, and at
 173  various times worked to cut it down. There is probably still fat to cut out -
 174  for example POSIX passes Exporter some very memory hungry data structures.
 175  
 176  =head2 embed.pl/makedef.pl
 177  
 178  There is a script F<embed.pl> that generates several header files to prefix
 179  all of Perl's symbols in a consistent way, to provide some semblance of
 180  namespace support in C<C>. Functions are declared in F<embed.fnc>, variables
 181  in F<interpvar.h>. Quite a few of the functions and variables
 182  are conditionally declared there, using C<#ifdef>. However, F<embed.pl>
 183  doesn't understand the C macros, so the rules about which symbols are present
 184  when are duplicated in F<makedef.pl>. Writing things twice is bad, m'kay.
 185  It would be good to teach F<embed.pl> to understand the conditional
 186  compilation, and hence remove the duplication, and the mistakes it has caused.
 187  
 188  =head2 use strict; and AutoLoad
 189  
 190  Currently if you write
 191  
 192      package Whack;
 193      use AutoLoader 'AUTOLOAD';
 194      use strict;
 195      1;
 196      __END__
 197      sub bloop {
 198          print join (' ', No, strict, here), "!\n";
 199      }
 200  
 201  then C<use strict;> isn't in force within the autoloaded subroutines. It would
 202  be more consistent (and less surprising) to arrange for all lexical pragmas
 203  in force at the __END__ block to be in force within each autoloaded subroutine.
 204  
 205  There's a similar problem with SelfLoader.
 206  
 207  =head1 Tasks that need a little sysadmin-type knowledge
 208  
 209  Or if you prefer, tasks that you would learn from, and broaden your skills
 210  base...
 211  
 212  =head2 make HTML install work
 213  
 214  There is an C<installhtml> target in the Makefile. It's marked as
 215  "experimental". It would be good to get this tested, make it work reliably, and
 216  remove the "experimental" tag. This would include
 217  
 218  =over 4
 219  
 220  =item 1
 221  
 222  Checking that cross linking between various parts of the documentation works.
 223  In particular that links work between the modules (files with POD in F<lib/>)
 224  and the core documentation (files in F<pod/>).
 225  
 226  =item 2
 227  
 228  Working out how to split C<perlfunc> into chunks, preferably one per function
 229  group, preferably with general case code that could be used elsewhere.
 230  Challenges here are correctly identifying the groups of functions that go
 231  together, and making the right named external cross-links point to the right
 232  page. Things to be aware of are C<-X>, groups such as C<getpwnam> to
 233  C<endservent>, two or more C<=item>s giving the different parameter lists, such
 234  as
 235  
 236      =item substr EXPR,OFFSET,LENGTH,REPLACEMENT
 237      =item substr EXPR,OFFSET,LENGTH
 238      =item substr EXPR,OFFSET
 239  
 240  and different parameter lists having different meanings (e.g. C<select>).
 241  
 242  =back
 243  
 244  =head2 compressed man pages
 245  
 246  Be able to install them. This would probably need a configure test to see how
 247  the system does compressed man pages (same directory/different directory?
 248  same filename/different filename), as well as tweaking the F<installman> script
 249  to compress as necessary.
 250  
 251  =head2 Add a code coverage target to the Makefile
 252  
 253  Make it easy for anyone to run Devel::Cover on the core's tests. The steps
 254  to do this manually are roughly
 255  
 256  =over 4
 257  
 258  =item *
 259  
 260  do a normal C<Configure>, but include Devel::Cover as a module to install
 261  (see F<INSTALL> for how to do this)
 262  
 263  =item *
 264  
 265      make perl
 266  
 267  =item *
 268  
 269      cd t; HARNESS_PERL_SWITCHES=-MDevel::Cover ./perl -I../lib harness
 270  
 271  =item *
 272  
 273  Process the resulting Devel::Cover database
 274  
 275  =back
 276  
 277  This just gives you the coverage of the F<.pm>s. To also get the C level
 278  coverage you need to
 279  
 280  =over 4
 281  
 282  =item *
 283  
 284  Additionally tell C<Configure> to use the appropriate C compiler flags for
 285  C<gcov>
 286  
 287  =item *
 288  
 289      make perl.gcov
 290  
 291  (instead of C<make perl>)
 292  
 293  =item *
 294  
 295  After running the tests run C<gcov> to generate all the F<.gcov> files.
 296  (Including down in the subdirectories of F<ext/>.)
 297  
 298  =item *
 299  
 300  (From the top level perl directory) run C<gcov2perl> on all the F<.gcov> files
 301  to get their stats into the cover_db directory.
 302  
 303  =item *
 304  
 305  Then process the Devel::Cover database
 306  
 307  =back
 308  
 309  It would be good to add a single switch to C<Configure> to specify that you
 310  wanted to perform perl level coverage, and another to specify C level
 311  coverage, and have C<Configure> and the F<Makefile> do all the right things
 312  automatically.
 313  
 314  =head2 Make Config.pm cope with differences between built and installed perl
 315  
 316  Quite often vendors ship a perl binary compiled with their (pay-for)
 317  compilers.  People install a free compiler, such as gcc. To work out how to
 318  build extensions, Perl interrogates C<%Config>, so in this situation
 319  C<%Config> describes compilers that aren't there, and extension building
 320  fails. This forces people into choosing between re-compiling perl themselves
 321  using the compiler they have, or only using modules that the vendor ships.
 322  
 323  It would be good to find a way to teach C<Config.pm> about the installation setup,
 324  possibly involving probing at install time or later, so that the C<%Config> in
 325  a binary distribution better describes the installed machine, when the
 326  installed machine differs from the build machine in some significant way.
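
To make the failure concrete, here is a rough sketch of the kind of probe this might involve, assuming a Unix-like system. The helper C<find_program> is hypothetical, and treating the first word of C<$Config{cc}> as the program name is also an assumption. It only checks whether the compiler recorded at build time can be found on the machine where perl is installed.

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Hypothetical helper: return the full path of $program if it can be
# found and executed, or nothing if it cannot.
sub find_program {
    my ($program) = @_;
    if (File::Spec->file_name_is_absolute($program)) {
        return -x $program ? $program : ();
    }
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $program);
        return $candidate if -f $candidate && -x $candidate;
    }
    return;
}

# $Config{cc} may carry arguments ("cc -m64"), so only the first word
# is treated as the program name here.
my ($compiler) = split ' ', ($Config{cc} || '');
if (defined $compiler && find_program($compiler)) {
    print "Build-time compiler '$compiler' is available here\n";
}
else {
    print "Build-time compiler is missing; extension builds will likely fail\n";
}
```

Finding the compiler is only the first step. Flags, libraries and include paths recorded in C<%Config> can be just as wrong for the installed machine.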
 327  
 328  =head2 linker specification files
 329  
 330  Some platforms mandate that you provide a list of a shared library's external
 331  symbols to the linker, so the core already has the infrastructure in place to
 332  do this for generating shared perl libraries. My understanding is that the
 333  GNU toolchain can accept an optional linker specification file, and restrict
 334  visibility just to symbols declared in that file. It would be good to extend
 335  F<makedef.pl> to support this format, and to provide a means within
 336  C<Configure> to enable it. This would allow Unix users to test that the
 337  export list is correct, and to build a perl that does not pollute the global
 338  namespace with private symbols.
 339  
 340  =head2 Cross-compile support
 341  
 342  Currently C<Configure> understands the C<-Dusecrosscompile> option. This option
 343  arranges for C<miniperl> to be built for the TARGET machine; this C<miniperl>
 344  is then assumed to be copied to the TARGET machine and used as a replacement
 345  for the full C<perl> executable.
 346  
 347  This could be done a little differently. Namely, C<miniperl> should be built for
 348  HOST and then the full C<perl> with extensions should be compiled for TARGET.
 349  This, however, might require extra trickery for %Config: we have one config
 350  first for HOST and then another for TARGET.  Tools like MakeMaker will be
 351  mightily confused.  Having two different types of executables and
 352  libraries around (HOST and TARGET) makes life interesting for Makefiles and
 353  shell (and Perl) scripts.  There is C<$Config{run}>, normally empty, which
 354  can be used as an execution wrapper.  Also note that in some
 355  cross-compilation/execution environments the HOST and the TARGET do
 356  not see the same filesystem(s), so C<$Config{run}> may need to do some
 357  file/directory copying back and forth.
 358  
 359  =head2 roffitall
 360  
 361  Make F<pod/roffitall> be updated by F<pod/buildtoc>.
 362  
 363  =head1 Tasks that need a little C knowledge
 364  
 365  These tasks would need a little C knowledge, but don't need any specific
 366  background or experience with XS, or how the Perl interpreter works.
 367  
 368  =head2 Exterminate PL_na!
 369  
 370  C<PL_na> festers still in the darkest corners of various typemap files.
 371  It needs to be exterminated, replaced by a local variable of type C<STRLEN>.
 372  
 373  =head2 Modernize the order of directories in @INC
 374  
 375  The way @INC is laid out by default, one cannot upgrade core (dual-life)
 376  modules without overwriting files. This causes problems for binary
 377  package builders.  One possible proposal is laid out in this
 378  message:
 379  L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2002-04/msg02380.html>.
 380  
 381  =head2 -Duse32bit*
 382  
 383  Natively 64-bit systems need neither -Duse64bitint nor -Duse64bitall.
 384  On these systems, it might be the default compilation mode, and there
 385  is currently no guarantee that passing no use64bitall option to the
 386  Configure process will build a 32bit perl. Implementing -Duse32bit*
 387  options would be nice for perl 5.12.
 388  
 389  =head2 Make it clear from -v if this is the exact official release
 390  
 391  Currently perl from C<p4>/C<rsync> ships with a F<patchlevel.h> file that
 392  usually defines one local patch, of the form "MAINT12345" or "RC1". The output
 393  of perl -v doesn't report that a perl isn't an official release, and this
 394  information can get lost in bug reports. Because of this, the minor version
 395  isn't bumped up until RC time, to minimise the possibility of versions of perl
 396  escaping that believe themselves to be newer than they actually are.
 397  
 398  It would be useful to find an elegant way to have the "this is an interim
 399  maintenance release" or "this is a release candidate" in the terse -v output,
 400  and have it so that it's easy for the pumpking to remove this just as the
 401  release tarball is rolled up. This way the version pulled out of rsync would
 402  always say "I'm a development release" and it would be safe to bump the
 403  reported minor version as soon as a release ships, which would aid perl
 404  developers.
 405  
 406  This task is really about thinking of an elegant way to arrange the C source
 407  such that it's trivial for the Pumpking to flag "this is an official release"
 408  when making a tarball, yet leave the default source saying "I'm not the
 409  official release".
 410  
 411  =head2 Profile Perl - am I hot or not?
 412  
 413  The Perl source code is stable enough that it makes sense to profile it,
 414  identify and optimise the hotspots. It would be good to measure the
 415  performance of the Perl interpreter using free tools such as cachegrind,
 416  gprof, and dtrace, and work to reduce the bottlenecks they reveal.
 417  
 418  As part of this, the idea of F<pp_hot.c> is that it contains the I<hot> ops,
 419  the ops that are most commonly used. The idea is that by grouping them, their
 420  object code will be adjacent in the executable, so they have a greater chance
 421  of already being in the CPU cache (or swapped in) due to being near another op
 422  already in use.
 423  
 424  Except that it's not clear if these really are the most commonly used ops. So
 425  as part of exercising your skills with coverage and profiling tools you might
 426  want to determine what ops I<really> are the most commonly used. And in turn
 427  suggest evictions and promotions to achieve a better F<pp_hot.c>.
 428  
 429  =head2 Allocate OPs from arenas
 430  
 431  Currently all new OP structures are individually malloc()ed and free()d.
 432  All C<malloc> implementations have space overheads, and are not as fast as
 433  custom allocators, so it would both use less memory and less CPU to allocate
 434  the various OP structures from arenas. The SV arena code can probably be
 435  re-used for this.
 436  
 437  Note that Configuring perl with C<-Accflags=-DPL_OP_SLAB_ALLOC> will use
 438  Perl_Slab_alloc() to pack optrees into a contiguous block, which is
 439  probably superior to the use of OP arenas, esp. from a cache locality
 440  standpoint.  See L</Profile Perl - am I hot or not?>.
 441  
 442  =head2 Improve win32/wince.c
 443  
 444  Currently, numerous functions look virtually, if not completely,
 445  identical in both C<win32/wince.c> and C<win32/win32.c> files, which can't
 446  be good.
 447  
 448  =head2 Use secure CRT functions when building with VC8 on Win32
 449  
 450  Visual C++ 2005 (VC++ 8.x) deprecated a number of CRT functions on the basis
 451  that they were "unsafe" and introduced differently named secure versions of
 452  them as replacements, e.g. instead of writing
 453  
 454      FILE* f = fopen(__FILE__, "r");
 455  
 456  one should now write
 457  
 458      FILE* f;
 459      errno_t err = fopen_s(&f, __FILE__, "r"); 
 460  
 461  Currently, the warnings about these deprecations have been disabled by adding
 462  -D_CRT_SECURE_NO_DEPRECATE to the CFLAGS. It would be nice to remove that
 463  warning suppressant and actually make use of the new secure CRT functions.
 464  
 465  There is also a similar issue with POSIX CRT function names like fileno having
 466  been deprecated in favour of ISO C++ conformant names like _fileno. These
 467  warnings are also currently suppressed by adding -D_CRT_NONSTDC_NO_DEPRECATE. It
 468  might be nice to do as Microsoft suggest here too, although, unlike the secure
 469  functions issue, there is presumably little or no benefit in this case.
 470  
 471  =head2 strcat(), strcpy(), strncat(), strncpy(), sprintf(), vsprintf()
 472  
 473  Maybe create a utility that checks after each libperl.a creation that
 474  none of the above (nor *SHUDDER* gets())
 475  ever creep back into libperl.a.
 476  
 477    nm libperl.a | ./miniperl -alne '$o = $F[0] if /:$/; print "$o $F[1]" if $F[0] eq "U" && $F[1] =~ /^(?:strn?c(?:at|py)|v?sprintf|gets)$/'
 478  
 479  Note, of course, that this will only tell whether B<your> platform
 480  is using those naughty interfaces.
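
The one-liner above could be grown into the suggested utility roughly as follows. This is a sketch, not an existing script: the function name C<find_unsafe_symbols> and the sample C<nm> output (including the member names) are invented for illustration, and the parsing assumes C<nm> output where each archive member is introduced by a line ending in a colon.

```perl
use strict;
use warnings;

# Scan lines of `nm` output and report undefined references to the
# unsafe functions. Returns a list of "member: symbol" strings.
sub find_unsafe_symbols {
    my (@nm_lines) = @_;
    my $unsafe = qr/^(?:strn?c(?:at|py)|v?sprintf|gets)$/;
    my $object = '';
    my @hits;
    for my $line (@nm_lines) {
        my @field = split ' ', $line;
        next unless @field;
        # nm prints "name.o:" before the symbols of each archive member.
        $object = $field[0] if $line =~ /:$/;
        push @hits, "$object $field[1]"
            if @field >= 2 && $field[0] eq 'U' && $field[1] =~ $unsafe;
    }
    return @hits;
}

# Made-up sample input; real input would come from `nm libperl.a`.
my @sample = (
    "util.o:",
    "                 U strcpy",
    "                 U memcpy",
    "0000000000000000 T Perl_example",
    "",
    "sv.o:",
    "                 U sprintf",
);
print "$_\n" for find_unsafe_symbols(@sample);
```

Keeping the scan in a function that takes lines as arguments means it can be tested without a built F<libperl.a>.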
 481  
 482  =head2 -D_FORTIFY_SOURCE=2, -fstack-protector
 483  
 484  Recent glibcs support C<-D_FORTIFY_SOURCE=2> and recent gcc
 485  (4.1 onwards?) supports C<-fstack-protector>, both of which give
 486  protection against various kinds of buffer overflow problems.
 487  These should probably be used for compiling Perl whenever available.
 488  Configure and/or hints files should be adjusted to probe for the
 489  availability of these features and enable them as appropriate.
 490  
 491  =head1 Tasks that need a knowledge of XS
 492  
 493  These tasks would need C knowledge, and roughly the level of knowledge of
 494  the perl API that comes from writing modules that use XS to interface to
 495  C.
 496  
 497  =head2 autovivification
 498  
 499  Make all autovivification consistent w.r.t. LVALUE/RVALUE and strict/no strict.
 500  
 501  This task is incremental - even a little bit of work on it will help.
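
A small illustration of the kind of inconsistency meant here, offered as an example rather than a complete catalogue. Reading a nested element creates the intermediate hash. Dereferencing a missing element in rvalue context dies under C<strict>. The same dereference in lvalue context (C<foreach> aliases its list) quietly creates an array.

```perl
use strict;
use warnings;

my %h;

# rvalue read of a nested element: the intermediate hash is created
my $unused = $h{a}{b};
my $nested_read = exists($h{a}) ? 1 : 0;

# rvalue dereference of a missing element: dies under strict refs
my $rvalue_deref = eval { my @copy = @{ $h{c} }; 1 } ? 1 : 0;

# the same dereference in lvalue context: an empty array is created
for my $item (@{ $h{d} }) { }
my $lvalue_deref = (exists($h{d}) && ref($h{d}) eq 'ARRAY') ? 1 : 0;

print "nested read vivified:  $nested_read\n";
print "rvalue deref survived: $rvalue_deref\n";
print "lvalue deref vivified: $lvalue_deref\n";
```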
 502  
 503  =head2 Unicode in Filenames
 504  
 505  chdir, chmod, chown, chroot, exec, glob, link, lstat, mkdir, open,
 506  opendir, qx, readdir, readlink, rename, rmdir, stat, symlink, sysopen,
 507  system, truncate, unlink, utime, -X.  All these could potentially accept
 508  Unicode filenames either as input or output (and in the case of system
 509  and qx Unicode in general, as input or output to/from the shell).
 510  Whether a given filesystem/operating system pair understands Unicode in
 511  filenames varies.
 512  
 513  Known combinations that have some level of understanding include
 514  Microsoft NTFS, Apple HFS+ (in Mac OS 9 and X), Apple UFS (in Mac
 515  OS X), and of course Plan 9; NFS v4 is rumored to be Unicode.  How to
 516  create Unicode filenames, what forms of Unicode are accepted and used
 517  (UCS-2, UTF-16, UTF-8), what (if any) is the normalization form used,
 518  and so on, varies.  Finding the right level of interfacing to Perl
 519  requires some thought.  Remember that an OS does not imply a
 520  filesystem.
 521  
 522  (The Windows -C command flag "wide API support" has been at least
 523  temporarily retired in 5.8.1, and the -C has been repurposed, see
 524  L<perlrun>.)
 525  
 526  Most probably the right way to do this would be this:
 527  L</"Virtualize operating system access">.
 528  
 529  =head2 Unicode in %ENV
 530  
 531  Currently the %ENV entries are always byte strings.
 532  See L</"Virtualize operating system access">.
 533  
 534  =head2 Unicode and glob()
 535  
 536  Currently glob patterns and filenames returned from File::Glob::glob()
 537  are always byte strings.  See L</"Virtualize operating system access">.
 538  
 539  =head2 Unicode and lc/uc operators
 540  
 541  Some built-in operators (C<lc>, C<uc>, etc.) behave differently, based on
 542  what the internal encoding of their argument is. That should not be the
 543  case. Maybe add a pragma to switch behaviour.
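
The following sketch shows the behaviour complained about, assuming no C<locale> or feature pragmas are in force. The two strings hold the same single character and compare equal, yet C<lc> gives different answers depending on how each is stored internally.

```perl
use strict;
use warnings;

my $bytes = "\xC0";          # LATIN CAPITAL LETTER A WITH GRAVE, stored as one byte
my $upgraded = $bytes;
utf8::upgrade($upgraded);    # same character, now stored internally as UTF-8

my $same_string = ($bytes eq $upgraded) ? 1 : 0;
my $lc_bytes    = lc $bytes;
my $lc_upgraded = lc $upgraded;
my $same_result = ($lc_bytes eq $lc_upgraded) ? 1 : 0;

printf "inputs compare equal:     %d\n", $same_string;
printf "lc results compare equal: %d\n", $same_result;
```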
 544  
 545  =head2 use less 'memory'
 546  
 547  Investigate trade offs to switch out perl's choices on memory usage.
 548  Particularly perl should be able to give memory back.
 549  
 550  This task is incremental - even a little bit of work on it will help.
 551  
 552  =head2 Re-implement C<:unique> in a way that is actually thread-safe
 553  
 554  The old implementation made bad assumptions on several levels. A good 90%
 555  solution might be just to make C<:unique> work to share the string buffer
 556  of SvPVs. That way large constant strings can be shared between ithreads,
 557  such as the configuration information in F<Config>.
 558  
 559  =head2 Make tainting consistent
 560  
 561  Tainting would be easier to use if it didn't take documented shortcuts and
 562  allow taint to "leak" everywhere within an expression.
 563  
 564  =head2 readpipe(LIST)
 565  
 566  system() accepts a LIST syntax (and a PROGRAM LIST syntax) to avoid
 567  running a shell. readpipe() (the function behind qx//) could be similarly
 568  extended.
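
Until something like that exists, much the same effect can be had with the list form of a piped C<open>, on platforms that support it. The wrapper below is a sketch. The name C<readpipe_list> is hypothetical, not a proposed interface.

```perl
use strict;
use warnings;

# Hypothetical helper: capture a command's output without going through
# a shell, using the list form of a piped open.
sub readpipe_list {
    my (@command) = @_;
    open(my $fh, '-|', @command) or die "Cannot run $command[0]: $!";
    my @lines = <$fh>;
    close $fh;
    return wantarray ? @lines : join '', @lines;
}

# $^X is the running perl, so the example needs no other programs.
# The argument contains shell metacharacters, which arrive untouched.
my $output = readpipe_list($^X, '-e', 'print "a b; c\n"');
print $output;
```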
 569  
 570  =head2 Audit the code for destruction ordering assumptions
 571  
 572  Change 25773 notes
 573  
 574      /* Need to check SvMAGICAL, as during global destruction it may be that
 575         AvARYLEN(av) has been freed before av, and hence the SvANY() pointer
 576         is now part of the linked list of SV heads, rather than pointing to
 577         the original body.  */
 578      /* FIXME - audit the code for other bugs like this one.  */
 579  
 580  adding the C<SvMAGICAL> check to
 581  
 582      if (AvARYLEN(av) && SvMAGICAL(AvARYLEN(av))) {
 583          MAGIC *mg = mg_find (AvARYLEN(av), PERL_MAGIC_arylen);
 584  
 585  Go through the core and look for similar assumptions that SVs have particular
 586  types, as all bets are off during global destruction.
 587  
 588  =head2 Extend PerlIO and PerlIO::Scalar
 589  
 590  PerlIO::Scalar doesn't know how to truncate().  Implementing this
 591  would require extending the PerlIO vtable.
 592  
 593  Similarly the PerlIO vtable doesn't know about formats (write()), or
 594  about stat(), or chmod()/chown(), utime(), or flock().
 595  
 596  (For PerlIO::Scalar it's hard to see what e.g. mode bits or ownership
 597  would mean.)
 598  
 599  PerlIO doesn't do directories or symlinks, either: mkdir(), rmdir(),
 600  opendir(), closedir(), seekdir(), rewinddir(), glob(); symlink(),
 601  readlink().
 602  
 603  See also L</"Virtualize operating system access">.
 604  
 605  =head2 -C on the #! line
 606  
 607  It should be possible to make -C work correctly if found on the #! line,
 608  given that all perl command line options are strict ASCII, and -C changes
 609  only the interpretation of non-ASCII characters, and not for the script file
 610  handle. To make it work needs some investigation of the ordering of function
 611  calls during startup, and (by implication) a bit of tweaking of that order.
 612  
 613  =head2 Propagate const outwards from Perl_moreswitches()
 614  
 615  Change 32057 changed the parameter and return value of C<Perl_moreswitches()>
 616  from C<char *> to C<const char *>. It should now be possible to propagate
 617  const-correctness outwards to C<S_parse_body()>, C<Perl_moreswitches()>
 618  and C<Perl_yylex()>.
 619  
 620  =head2 Duplicate logic in S_method_common() and Perl_gv_fetchmethod_autoload()
 621  
 622  A comment in C<S_method_common> notes
 623  
 624      /* This code tries to figure out just what went wrong with
 625         gv_fetchmethod.  It therefore needs to duplicate a lot of
 626         the internals of that function.  We can't move it inside
 627         Perl_gv_fetchmethod_autoload(), however, since that would
 628         cause UNIVERSAL->can("NoSuchPackage::foo") to croak, and we
 629         don't want that.
 630      */
 631  
 632  If C<Perl_gv_fetchmethod_autoload> gets rewritten to take (more) flag bits,
 633  then it ought to be possible to move the logic from C<S_method_common> to
 634  the "right" place. When making this change it would probably be good to also
 635  pass in at least the method name length, if not also pre-computed hash values
 636  when known. (I'm contemplating a plan to pre-compute hash values for common
 637  fixed strings such as C<ISA> and pass them in to functions.)
 638  
 639  =head2 Organize error messages
 640  
 641  Perl's diagnostics (error messages, see L<perldiag>) could use
 642  reorganizing and formalizing so that each error message has its
 643  stable-for-all-eternity unique id, categorized by severity, type, and
 644  subsystem.  (The error messages would be listed in a datafile outside
 645  of the Perl source code, and the source code would only refer to the
 646  messages by the id.)  This clean-up and regularizing should apply
 647  for all croak() messages.
 648  
 649  This would enable all sorts of things: easier translation/localization
 650  of the messages (though please do keep in mind the caveats of
 651  L<Locale::Maketext> about too straightforward approaches to
 652  translation), filtering by severity, and instead of grepping for a
 653  particular error message one could look for a stable error id.  (Of
 654  course, changing the error messages by default would break all the
 655  existing software depending on some particular error message...)
 656  
 657  This kind of functionality is known as I<message catalogs>.  Look for
 658  inspiration for example in the catgets() system, possibly even use it
 659  if available, but B<only> if available: B<not> all platforms will
 660  have catgets().
 661  
 662  For the really pure at heart, consider extending this item to cover
 663  also the warning messages (see L<perllexwarn>, C<warnings.pl>).
 664  
 665  =head1 Tasks that need a knowledge of the interpreter
 666  
 667  These tasks would need C knowledge, and knowledge of how the interpreter works,
 668  or a willingness to learn.
 669  
 670  =head2 UTF-8 revamp
 671  
 672  The handling of Unicode is unclean in many places. For example, the regexp
 673  engine matches in Unicode semantics whenever the string or the pattern is
 674  flagged as UTF-8, but that should not be dependent on an internal storage
 675  detail of the string. Likewise, case folding behaviour is dependent on the
 676  UTF8 internal flag being on or off.
 677  
 678  =head2 Properly Unicode safe tokeniser and pads.
 679  
 680  The tokeniser isn't actually very UTF-8 clean. C<use utf8;> is a hack -
 681  variable names are stored in stashes as raw bytes, without the utf-8 flag
 682  set. The pad API only takes a C<char *> pointer, so that's all bytes too. The
 683  tokeniser ignores the UTF-8-ness of C<PL_rsfp>, or any SVs returned from
 684  source filters.  All this could be fixed.
 685  
 686  =head2 state variable initialization in list context
 687  
 688  Currently this is illegal:
 689  
 690      state ($a, $b) = foo(); 
 691  
 692  In Perl 6, C<state ($a) = foo();> and C<(state $a) = foo();> have different
 693  semantics, which is tricky to implement in Perl 5 as currently they produce
 694  the same opcode trees. The Perl 6 design is firm, so it would be good to
 695  implement the necessary code in Perl 5. There are comments in
 696  C<Perl_newASSIGNOP()> that show the code paths taken by various assignment
 697  constructions involving state variables.
 698  
 699  =head2 Implement $value ~~ 0 .. $range
 700  
 701  It would be nice to extend the syntax of the C<~~> operator to also
 702  understand numeric (and maybe alphanumeric) ranges.
 703  
 704  =head2 A does() built-in
 705  
 706  Like ref(), only useful. It would call the C<DOES> method on objects; it
 707  would also tell whether something can be dereferenced as an
 708  array/hash/etc., or used as a regexp, etc.
 709  L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-03/msg00481.html>
 710  
 711  =head2 Tied filehandles and write() don't mix
 712  
 713  There is no method on tied filehandles to allow them to be called back by
 714  formats.
 715  
 716  =head2 Attach/detach debugger from running program
 717  
 718  The old perltodo notes "With C<gdb>, you can attach the debugger to a running
 719  program if you pass the process ID. It would be good to do this with the Perl
 720  debugger on a running Perl program, although I'm not sure how it would be
 721  done." ssh and screen do this with named pipes in /tmp. Maybe we can too.
 722  
 723  =head2 Optimize away empty destructors
 724  
 725  Defining an empty DESTROY method might be useful (notably in
 726  AUTOLOAD-enabled classes), but it's still a bit expensive to call. That
 727  could probably be optimized.
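
As a minimal sketch of why such empty destructors get written in the first
place (the C<Lazy> class and its C<@autoloaded> log are hypothetical names
used only for illustration), consider an AUTOLOAD-enabled class. The empty
C<DESTROY> keeps object destruction from being dispatched to C<AUTOLOAD>:

```perl
use strict;
use warnings;

package Lazy;

our $AUTOLOAD;
our @autoloaded;    # records every method name that reached AUTOLOAD

sub new { return bless {}, shift }

# Catch-all for undefined methods; here it only logs the requested name.
sub AUTOLOAD {
    push @autoloaded, $AUTOLOAD;
    return;
}

# Empty destructor: without it, object destruction would also be
# dispatched to AUTOLOAD.
sub DESTROY { }

package main;

{
    my $obj = Lazy->new;
    $obj->anything;    # not defined, so AUTOLOAD handles it
}                      # $obj goes out of scope here and DESTROY runs

print "@Lazy::autoloaded\n";
```

Only the explicit method call is logged. The call to the empty C<DESTROY>
still happens, and that call is the cost this item proposes to remove.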
 728  
 729  =head2 LVALUE functions for lists
 730  
 731  The old perltodo notes that lvalue functions don't work for list or hash
 732  slices. This would be good to fix.
 733  
 734  =head2 LVALUE functions in the debugger
 735  
 736  The old perltodo notes that lvalue functions don't work in the debugger. This
 737  would be good to fix.
 738  
 739  =head2 regexp optimiser optional
 740  
 741  The regexp optimiser is not optional. It should be possible to disable it, to
 742  allow its performance to be measured, and its bugs to be easily demonstrated.
 743  
 744  =head2 delete &function
 745  
 746  Allow functions to be deleted. One can already undef them, but they're still
 747  in the stash.
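
The difference can be illustrated with a short sketch (the sub name C<greet>
is made up for the example). It also shows the blunt workaround available
today, deleting the stash entry by hand, which removes the whole typeglob
rather than just the function:

```perl
use strict;
use warnings;

sub greet { return 'hello' }

# undef removes the body of the sub, but the name stays in the symbol table.
undef &greet;
my $defined_after_undef  = defined &greet        ? 1 : 0;
my $in_stash_after_undef = exists $main::{greet} ? 1 : 0;

# Deleting the stash entry removes the whole typeglob, so it would also
# drop any $greet, @greet or %greet package variables of the same name.
delete $main::{greet};
my $in_stash_after_delete = exists $main::{greet} ? 1 : 0;

print "defined after undef: $defined_after_undef\n";
print "in stash after undef: $in_stash_after_undef\n";
print "in stash after delete: $in_stash_after_delete\n";
```

A real C<delete &function> would presumably remove only the code slot and
leave other variables of the same name alone.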
 748  
 749  =head2 C</w> regex modifier
 750  
 751  That flag would enable whole-word matching, and would also interpolate
 752  arrays as alternations. With it, C</P/w> would be roughly equivalent to:
 753  
 754      do { local $"='|'; /\b(?:P)\b/ }
 755  
 756  See L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-01/msg00400.html>
 757  for the discussion.
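
The equivalence above can be tried out with existing syntax. The following
is only a sketch: C<match_whole_word> is a hypothetical helper, not a core
function, and each word is interpolated as a pattern, not quoted:

```perl
use strict;
use warnings;

# Hypothetical helper (not a core function): emulates what the proposed
# /w flag is described as doing, using only existing regexp syntax.
sub match_whole_word {
    my ($text, @words) = @_;
    # $" is the separator used when an array is interpolated; setting it
    # to '|' turns @words into an alternation inside the pattern.
    local $" = '|';
    return $text =~ /\b(?:@words)\b/ ? 1 : 0;
}

my $hit  = match_whole_word('a cat sat', qw(cat dog));    # "cat" is a whole word
my $miss = match_whole_word('concatenate', qw(cat dog));  # "cat" is only a substring
print "hit=$hit miss=$miss\n";
```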
 758  
 759  =head2 optional optimizer
 760  
 761  Make the peephole optimizer optional. Currently it performs two tasks as
 762  it walks the optree - genuine peephole optimisations, and necessary fixups of
 763  ops. It would be good to find an efficient way to switch out the
 764  optimisations whilst keeping the fixups.
 765  
 766  =head2 You WANT *how* many
 767  
 768  Currently contexts are void, scalar and list. split has a special mechanism in
 769  place to pass in the number of return values wanted. It would be useful to
 770  have a general mechanism for this, one that is backwards compatible and has
 771  little speed hit. This would allow proposals such as short circuiting sort
 772  to be implemented as a module on CPAN.
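
To make the limitation concrete, here is a small sketch of what a sub can
learn about its calling context today (the names C<context> and C<@seen>
are invented for the example). C<wantarray> distinguishes the three
contexts but says nothing about how many values the caller wants:

```perl
use strict;
use warnings;

my @seen;

# wantarray returns undef in void context, false in scalar context and
# true in list context; it does not say how many values are wanted.
sub context {
    my $want = wantarray;
    push @seen, !defined $want ? 'void' : $want ? 'list' : 'scalar';
    return;
}

context();                  # void
my $scalar = context();     # scalar
my ($x, $y) = context();    # list, but the sub cannot tell that two values are wanted

print "@seen\n";
```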
 773  
 774  =head2 lexical aliases
 775  
 776  Allow lexical aliases (maybe via the syntax C<my \$alias = \$foo>).
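
For readers unsure what an alias would mean here, the closest existing
construct is probably the C<foreach> loop variable, which aliases each
element it visits. The sketch below uses made-up variable names and shows a
workaround, not the proposed feature:

```perl
use strict;
use warnings;

my $foo = 1;

# The loop variable of foreach is an alias for the current element, so
# assigning to $alias changes $foo itself, not a copy.
for my $alias ($foo) {
    $alias = 42;
}

print "$foo\n";
```

The alias only lives for the duration of the loop body, which is part of
why a dedicated syntax would be more convenient.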
 777  
 778  =head2 entersub XS vs Perl
 779  
 780  At the moment pp_entersub is huge, and has code to deal with entering both
 781  perl and XS subroutines. Subroutine implementations rarely change between 
 782  perl and XS at run time, so investigate using 2 ops to enter subs (one for
 783  XS, one for perl) and swap between them if a sub is redefined.
 784  
 785  =head2 Self-ties
 786  
 787  Self-ties are currently illegal because they caused too many segfaults. Maybe
 788  the causes of these could be tracked down and self-ties on all types
 789  reinstated.
 790  
 791  =head2 Optimize away @_
 792  
 793  The old perltodo notes "Look at the "reification" code in C<av.c>".
 794  
 795  =head2 The yada yada yada operators
 796  
 797  Perl 6's Synopsis 3 says:
 798  
 799  I<The ... operator is the "yada, yada, yada" list operator, which is used as
 800  the body in function prototypes. It complains bitterly (by calling fail)
 801  if it is ever executed. Variant ??? calls warn, and !!! calls die.>
 802  
 803  Those would be nice to add to Perl 5. That could be done without new ops.
 804  
 805  =head2 Virtualize operating system access
 806  
 807  Implement a set of "vtables" that virtualizes operating system access
 808  (open(), mkdir(), unlink(), readdir(), getenv(), etc.)  At the very
 809  least these interfaces should take SVs as "name" arguments instead of
 810  bare char pointers; probably the most flexible and extensible way
 811  would be for the Perl-facing interfaces to accept HVs.  The system
 812  needs to be per-operating-system and per-file-system
 813  hookable/filterable, preferably both from XS and Perl level
 814  (L<perlport/"Files and Filesystems"> is good reading at this point,
 815  in fact, all of L<perlport> is.)
 816  
 817  This has actually already been implemented (but only for Win32),
 818  take a look at F<iperlsys.h> and F<win32/perlhost.h>.  While all Win32
 819  variants go through a set of "vtables" for operating system access,
 820  non-Win32 systems currently go straight for the POSIX/UNIX-style
 821  system/library call.  A system similar to the Win32 one should be
 822  implemented for all platforms.  The existing Win32 implementation
 823  probably does not need to survive alongside this proposed new
 824  implementation; the approaches could be merged.
 825  
 826  What would this give us?  One often-asked-for feature this would
 827  enable is using Unicode for filenames, and other "names" like %ENV,
 828  usernames, hostnames, and so forth.
 829  (See L<perlunicode/"When Unicode Does Not Happen">.)
 830  
 831  But this kind of virtualization would also allow for things like
 832  virtual filesystems, virtual networks, and "sandboxes" (though as long
 833  as dynamic loading of random object code is allowed, not very safe
 834  sandboxes, since external code of course knows nothing of Perl's vtables).
 835  An example of a smaller "sandbox" is that this feature can be used to
 836  implement per-thread working directories: Win32 already does this.
 837  
 838  See also L</"Extend PerlIO and PerlIO::Scalar">.
 839  
 840  =head2 Investigate PADTMP hash pessimisation
 841  
 842  The peephole optimiser converts constants used for hash key lookups to shared
 843  hash key scalars. Under ithreads, something is undoing this work. See
 844  L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-09/msg00793.html>.
 845  
 846  =head1 Big projects
 847  
 848  Tasks that will get your name mentioned in the description of the "Highlights
 849  of 5.12".
 850  
 851  =head2 make ithreads more robust
 852  
 853  Generally make ithreads more robust. See also L</iCOW>.
 854  
 855  This task is incremental - even a little bit of work on it will help, and
 856  will be greatly appreciated.
 857  
 858  One bit would be to write the missing code in sv.c:Perl_dirp_dup.
 859  
 860  Fix Perl_sv_dup, et al so that threads can return objects.
 861  
 862  =head2 iCOW
 863  
 864  Sarathy and Arthur have a proposal for an improved Copy On Write which
 865  specifically will be able to COW new ithreads. If this can be implemented
 866  it would be a good thing.
 867  
 868  =head2 (?{...}) closures in regexps
 869  
 870  Fix (or rewrite) the implementation of the C</(?{...})/> closures.
 871  
 872  =head2 A re-entrant regexp engine
 873  
 874  This will allow the use of a regex from inside (?{ }), (??{ }) and
 875  (?(?{ })|) constructs.
 876  
 877  =head2 Add class set operations to regexp engine
 878  
 879  Apparently these are quite useful. Anyway, Jeffrey Friedl wants them.
 880  
 881  demerphq has this on his todo list, but right at the bottom.  

