All MUX controls come from table!
                            ___
                           | 1 |== 0
               H|====/=====|MUX|                  8
           16   |    8     |_0_|===o==============/==|H
    Data <==/===|    ___     B     |                 |===/=== Data
    to CPU      |   | 1 |==========o                 |   16   from mem
               L|=/=|MUX|          |              8  |
                  8 |_0_|===o======|==============/==|L
                      A     |      |   ___
                            |       ==| 1 |   ___
                            |         |MUX|==| 1 |
                       =====|=========|_0_|  |MUX|=/=|H
           16  H|=====|=====|===========D====|_0_| 8 |
    Data ===/===|     |     |    ___           E     |===/==> Data
    from CPU    |     |      ===| 1 |                |   16   to mem
                |     |   8     |MUX|========/=======|L
               L|=====o===/=====|_0_|        8  
                                  C
                 |=======/==================================> Addr
    Addr ==/=====|      15                                    to mem
          16  Lsb|---> L
    write -----------> W   To table
    byte  -----------> B
    fault <----------  F   From table
            Table  W L B | A B C D E F
                  -------+-------------
                   0 0 0 | 0 0 x x x 0  read word
                   0 0 1 | 0 1 x x x 0  read even byte
                   0 1 0 | x x x x x 1  read nonaligned word
                   0 1 1 | 1 1 x x x 0  read odd byte
                   1 0 0 | x x 0 x 0 0  write word
                   1 0 1 | x x 0 1 1 0  write even byte
                   1 1 0 | x x x x x 1  write nonaligned word
                   1 1 1 | x x 1 0 1 0  write odd byte
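The table above can also be read as a small lookup, sketched here in Python as an illustration only. The names MUX_TABLE, mux_controls, and X are hypothetical and not part of the original design; X stands for a don't-care entry.

```python
# Don't-care entries from the table are represented as None.
X = None

# Key: (W, L, B) = (write, address lsb, byte access)
# Value: (A, B, C, D, E, F) = five MUX controls plus the fault output
MUX_TABLE = {
    (0, 0, 0): (0, 0, X, X, X, 0),  # read word
    (0, 0, 1): (0, 1, X, X, X, 0),  # read even byte
    (0, 1, 0): (X, X, X, X, X, 1),  # read nonaligned word
    (0, 1, 1): (1, 1, X, X, X, 0),  # read odd byte
    (1, 0, 0): (X, X, 0, X, 0, 0),  # write word
    (1, 0, 1): (X, X, 0, 1, 1, 0),  # write even byte
    (1, 1, 0): (X, X, X, X, X, 1),  # write nonaligned word
    (1, 1, 1): (X, X, 1, 0, 1, 0),  # write odd byte
}


def mux_controls(write, lsb, byte):
    """Return the table row for the given inputs as a dict keyed by column name."""
    row = MUX_TABLE[(write, lsb, byte)]
    return dict(zip("ABCDEF", row))


if __name__ == "__main__":
    for inputs in sorted(MUX_TABLE):
        print(inputs, mux_controls(*inputs))
```

Note that the fault output F is 1 exactly when L = 1 and B = 0, that is, for a word access at an odd address.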
This leads to the following design for the chip:
     data ---------------------------------o---------
                                           |         |
     read ---------------------------------|--------/ \
              10                ___________|___    /___\
     address ==/==o====|lo     |         data  |     |
                __|__  |       |          in   |     |
     load hi --|>____| |===/===| addr          |     |
                  |    |  20   |               |     |
                   =/==|hi     |    1 meg x 1  |     |
                   10          |       RAM     |     |
                               |               |     |
     write --------------------|> strobe data  |     |
                               |__________out__|     |
                                           |         |
                                            ---------
Part B:
Given such a chip, here is an outline of supporting logic needed
for a 4-megabyte, 32-bit word, word-addressable RAM board:
            32
     data ===/================================o==========
               22                             |
     address ===/====o========================|==========
                     |                        |
     address valid o-|------------------------|----------
     read -------o-|-|------------------------|----------
     write ----o-|-|-|------------------------|----------
     busy ---o-|-|-|-|------------------------|----------
             | | | | |                        |
             | | | | |    10     ___           ==> 1 bit to each RAM chip
             | | | | |  |=/=====| 1 |   10
             | | | | |  | 10    |MUX|===/===> to all 32 RAM chips
             | | | |  --|=/=====|_0_|
             | | | |    | 2  ___  |    ___
             | | | | msb|-/-|   | o---|not|--> load hi to all 32 RAM chips
             | | | |      2 | = |-|-  |___|
             | | | | addr-/-|___| | |    
             | | | |(const)       | |  ___
             | | | |              |  -|and|    ___
             | | |  --------------o---|___|-o-|and|
             | |  --------------------------|-|___|---> read to all 32 chips
             | |                            |  ___
             | |                            o-|and|
             |  ----------------------------|-|___|---> write to all 32 chips
             |                              |
            / \-----------------------------o
           /___\                            |board select
             |      _______     ___         |
              -----| delay |---|not|--------
                   |_______|   |___|
The board select line above goes true when the address is valid and
when the top 2 bits of the address match the board's address (a 2-bit
address constant, probably coming from jumpers or switches on the board).
The board select line enables the read and write signals to the
RAM chips, and it produces a busy pulse back to the user.  The delay in
the busy pulse logic must exceed the delay inherent in the RAM chips.
The address-valid signal is also used to control the multiplexing of the
address inputs to the RAM chips.  As it goes from zero to one, the address
must already be valid, so this transition clocks the high bits of the address
into the on-chip address register, and also switches the multiplexor to
admit the low bits of the address directly from the bus.
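The combinational part of this board logic can be sketched in Python as a rough behavioral model. It assumes the 22-bit address splits into 2 board-select bits (top), 10 high chip-address bits, and 10 low chip-address bits, as the diagram suggests. The function name decode and the dictionary keys are hypothetical, and the delay, the busy pulse, and the edge-triggered load of the on-chip register are not modeled.

```python
def decode(address, address_valid, read, write, board_id):
    """Behavioral sketch of the board-select and address-multiplexing logic.

    address       -- 22-bit word address from the bus
    address_valid -- 0 or 1
    read, write   -- 0 or 1, bus control lines
    board_id      -- 2-bit constant set by jumpers or switches (assumed)
    """
    board_bits = (address >> 20) & 0x3    # top 2 bits select the board
    hi = (address >> 10) & 0x3FF          # high 10 bits of the chip address
    lo = address & 0x3FF                  # low 10 bits of the chip address

    # Board select: address is valid and the top 2 bits match this board.
    board_select = bool(address_valid) and board_bits == board_id

    return {
        "board_select": board_select,
        # Before address-valid rises the chips see the high bits;
        # afterwards the multiplexor admits the low bits from the bus.
        "chip_address": lo if address_valid else hi,
        # Board select gates the read and write lines to the chips.
        "read": board_select and bool(read),
        "write": board_select and bool(write),
    }


if __name__ == "__main__":
    addr = (2 << 20) | (5 << 10) | 7
    print(decode(addr, 0, 1, 0, board_id=2))
    print(decode(addr, 1, 1, 0, board_id=2))
```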
In any case, 4-way interleaving is possible with the board design shown
above because 2 bits are used for board select.  However, if interleaving
is used, the board-select bits should be the bottom two bits of the address,
and this would make it very difficult to configure systems with less than
the full memory configuration (all 4 boards would have to be installed
before the system was useful).  In the mid-1980s, few workstations were
configured with 16 Meg of memory; most had 2 to 8 Meg.  Interleaving based
on the above suggestions would therefore have been very unlikely.
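The difference between the two choices of board-select bits can be seen in a short Python sketch. The function names board_high and board_low are hypothetical; the sketch assumes a 22-bit word address and 4 boards, as in the design above.

```python
def board_high(address):
    """Board chosen by the top 2 bits of a 22-bit address (design shown above)."""
    return (address >> 20) & 0x3


def board_low(address):
    """Board chosen by the bottom 2 bits (what interleaving would call for)."""
    return address & 0x3


if __name__ == "__main__":
    for address in range(8):
        print(address, board_high(address), board_low(address))
```

With the top bits as board select, consecutive words stay on one board, so a system with a single board still has a contiguous block of memory. With the bottom bits, consecutive words rotate through all 4 boards, so every board must be present.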