Harvard Architecture

#3628

Author: kruger@16bits.de
Date: Tue, 01 Mar 1988 22:07

11 lines
415 bytes


a) An architecture working at a prestigious law firm, making over $40K to start.

b) An architecture that has separate instruction and data busses so that
instruction fetches don't stop the processor from doing data transfers, which is
what it's supposed to be doing!
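
As a rough illustration of definition (b), here is a toy cycle count. It
assumes one bus cycle per instruction fetch and per data transfer, and ideal
overlap on the split design. The function names and the one-cycle costs are
made up for illustration and are not figures for any real processor.

```python
def shared_bus_cycles(fetches, data_transfers):
    # One bus: every fetch and every data transfer needs its own slot.
    return fetches + data_transfers


def split_bus_cycles(fetches, data_transfers):
    # Separate instruction and data paths: the two streams overlap,
    # so the longer stream sets the pace (ideal case, no stalls).
    return max(fetches, data_transfers)


print(shared_bus_cycles(100, 40))  # 140
print(split_bus_cycles(100, 40))   # 100
```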

Motorola claims that the 68030 has an "internal Harvard architecture" by which
they mean it has separate internal instruction and data caches.

dov

Re: Harvard Architecture

#3755

Author: brucek@hpsrla.HP
Date: Mon, 07 Mar 1988 21:55

16 lines
473 bytes


+-------
| Motorola claims that the 68030 has an "internal Harvard architecture" by
| which they mean it has separate internal instruction and data caches.
+-------

Ahh, those massive 256-Byte caches are really going to speed this puppy up :-)



                              Bruce Kleinman
          brucek%hpnmd@hpcea.hp.com  -or-  ...hplabs!hpnmd!brucek

              Hewlett Packard - Network Measurements Division
                          Santa Rosa, California

Re: Harvard Architecture

#3776

Author: lindsay@K.GP.CS.
Date: Wed, 09 Mar 1988 05:22

17 lines
659 bytes

In article <3460011@hpsrla.HP.COM> brucek@hpsrla.HP.COM (Bruce Kleinman)
 writes about the 68030:
>Ahh, those massive 256-Byte caches are really going to speed this puppy up :-)

Actually, it will. Remember, the CDC 6600 got a win from an "instruction
stack" of 480 bits !

Plus, the two caches access in parallel (versus the one cache of the 68020).
Plus, the caches now take one clock (versus 2 on the 68020).
Plus, the caches now have burst refill (if the board designer supports it,
of course.)

All in all, a clear improvement. I don't hear any suggestions as to a better
use for the silicon.
--
	Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science

Re: Harvard Architecture

#3785

Author: bcase@Apple.COM
Date: Wed, 09 Mar 1988 20:07

20 lines
795 bytes

In article <1071@PT.CS.CMU.EDU> lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
>In article <3460011@hpsrla.HP.COM> brucek@hpsrla.HP.COM (Bruce Kleinman)
> writes about the 68030:
>>Ahh, those massive 256-Byte caches are really going to speed this puppy up :-)

Talking about 68030.

>
>Actually, it will. Remember, the CDC 6600 got a win from an "instruction
>stack" of 480 bits !
>
>All in all, a clear improvement. I don't hear any suggestions as to a better
>use for the silicon.

A little birdie with an EE degree told me that you can expect maybe a 20%
improvement over a 68020 at the same clock rate.  An improvement, yes, but
a better use of silicon might have been some on-chip floating point.  Or
how about more pins so as to expose the Harvard architecture to the external
world?

Re: Harvard Architecture

#3786

Author: bcase@Apple.COM
Date: Wed, 09 Mar 1988 20:14

8 lines
375 bytes

In article <7614@apple.Apple.Com> bcase@apple.UUCP (Brian Case) writes:
>A little birdie with an EE degree told me that you can expect maybe a 20%
>improvement over a 68020 at the same clock rate.

Oops, I should have said that the little birdie also showed me the system
running real stuff.  That is one amazing little bird (has trouble holding
the soldering iron though).

Re: Harvard Architecture

#3808

Author: brucek@hpsrla.HP
Date: Thu, 10 Mar 1988 20:09

39 lines
1726 bytes

+-------
| >Ahh, those massive 256-Byte caches are really going to speed this
| > puppy up :-)
|
| Actually, it will. Remember, the CDC 6600 got a win from an "instruction
| stack" of 480 bits !
|
| Plus, the two caches access in parallel (versus the one cache of the 68020).
| Plus, the caches now take one clock (versus 2 on the 68020).
| Plus, the caches now have burst refill (if the board designer supports it,
| of course.)
|
| All in all, a clear improvement. I don't hear any suggestions as to a better
| use for the silicon.
+-------

All in all a clear improvement?  Over the '020, perhaps ...
The 256-byte data cache is of questionable value, as the miss-rate will be
fairly high.  I would supply numbers for "fairly high," but I can't seem to
find any miss-rate data for caches of smaller than 1 Kbyte.  Furthermore, the
restricted implementation of the data cache makes it useless in multi-processor
systems as well as systems with DMA.  The data cache has no facilities for
coherency.  Solution - disable the data cache or flush, flush away.
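
The coherency problem can be sketched with a toy model, assuming a cache that
never observes writes made by another bus master. The class name and the
dictionary-backed "memory" are hypothetical; this does not model the 68030's
actual cache control mechanism, only the stale-read effect and why a flush
hides it.

```python
class TinyDataCache:
    """Cache with no snooping: it never notices writes made behind its back."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # address -> cached value

    def read(self, addr):
        if addr not in self.lines:  # miss: fill from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def flush(self):
        # Discard everything so the next read goes back to memory.
        self.lines.clear()


memory = {0x1000: 1}
cache = TinyDataCache(memory)

first = cache.read(0x1000)   # CPU read: value is now cached
memory[0x1000] = 2           # a DMA device writes memory directly
stale = cache.read(0x1000)   # hit in the cache: the old value comes back
cache.flush()
fresh = cache.read(0x1000)   # miss after the flush: the new value is fetched

print(first, stale, fresh)   # 1 1 2
```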

Suggestions as to a better use for the silicon?  OK ...
Expand the I-cache.  I suspect that losing the D-cache could make room for
a 1 Kbyte I-cache.  This would offer honest improvements in almost all systems.
High-perf machines can surround the '030 with a hefty sys-cache for
increased performance.  Memory bandwidth problems, you say?  Bring the
Harvard architecture out to the pins - which is exactly what Motorola did
for the 88000.


                              Bruce Kleinman
          brucek%hpnmd@hpcea.hp.com  -or-  ...hplabs!hpnmd!brucek

              Hewlett Packard - Network Measurements Division
                          Santa Rosa, California

CPU chip cache sizes, was Re: Harvard Architecture

#3822

Author: grenley@nsc.nsc.
Date: Fri, 11 Mar 1988 20:29

29 lines
929 bytes

In article <1071@PT.CS.CMU.EDU> lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
>In article <3460011@hpsrla.HP.COM> brucek@hpsrla.HP.COM (Bruce Kleinman)
> writes about the 68030:
>>Ahh, those massive 256-Byte caches are really going to speed this puppy up :-)

I'm sure they will.  Heaven knows it needs it...

>Plus, the two caches access in parallel (versus the one cache of the 68020).

As do the two caches on the 32532

>Plus, the caches now take one clock (versus 2 on the 68020).

Likewise on the 532

>Plus, the caches now have burst refill (if the board designer supports it,
>of course.)

So does the 532.  We also have larger caches (512-byte instruction, 1K 2-way
set-associative data).  I have seen the studies on hit rate vs size for
the 532; since the 030 is roughly similar architecture I expect they have
the same tradeoffs.  256 bytes is better than no bytes, but it is still
pretty small.

George Grenley
NSC

Re: CPU chip cache sizes, was Re: Harvard Architecture

#3828

Author: mash@mips.COM (J
Date: Sat, 12 Mar 1988 03:45

21 lines
1089 bytes

In article <5009@nsc.nsc.com> grenley@nsc.UUCP (George Grenley) writes:
>So does the 532.  We also have larger caches (512 instruction, 1K 2 way
>set associative data).  I have seen the studies on hit rate vs size for
>the 532; since the 030 is roughly similar architecture I expect they have
>the same tradeoffs.  256 bytes is better than no bytes, but it is still
>pretty small.

I recall there was speculation when the 68030 was announced that the
D-cache might actually cost you performance in general applications,
and that people would end up turning it off [unlike the I-cache,
where even a small cache is almost always useful].
However, I've seen no data published one way or another on this yet,
and I don't have any.
Do you (or anybody else) have any good data on a 256-byte cache with
16 16-byte lines? (i.e., the 68030 D-cache)
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

Re: CPU chip cache sizes, was Re: Harvard Architecture

#3851

Author: north@Apple.COM
Date: Mon, 14 Mar 1988 17:50

29 lines
1843 bytes

In article <1853@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
>I recall there was speculation when the 68030 was announced that the
>D-cache might actually cost you performance in general applications,
>Do you (or anybody else) have any good data on a 256-byte cache with
>16 16-byte lines? (i.e., the 68030 D-cache)

Having had some '030 experience as of late, I have found that (in real working
hardware) the D-cache is always a 'win', even though it is small by most
standards.  I have yet to find a benchmark (Dhrystone, any others of the small
integer class) or some real application code in which the performance is less
when the cache is enabled than disabled.  The typical performance improvement
ranges from a low of 5% to a high of 25%, 'average' for larger applications
appears to be about 20%.

In looking at the cache organization (direct mapped, 16 lines of 16 bytes) one could
construct a sequence of references (accessing data locations exactly 256 bytes
apart, for example) that causes thrashing in particular cache lines.  This is a
problem in all direct mapped caches, and one would think it to be especially
severe with so few entries (16 in the '030's case).  I suspect
this is one reason for the relatively low performance improvement figures; the
other is that the cache is just too small to be 'really' useful except in limited
situations.  Two that come immediately to mind are pushing stack arguments that
are then accessed relatively soon, and accesses to local stack storage.
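
A minimal sketch of the two access patterns described above, assuming a
direct-mapped cache of 16 lines of 16 bytes and counting only hits and misses
(no timing, no writes). The function name and the address sequences are
invented for illustration and are not measurements from real hardware.

```python
LINE_SIZE = 16   # bytes per line
NUM_LINES = 16   # 16 lines * 16 bytes = 256 bytes total


def hit_rate(addresses):
    tags = [None] * NUM_LINES            # one tag per line, all empty at start
    hits = 0
    for addr in addresses:
        line_number = addr // LINE_SIZE
        index = line_number % NUM_LINES  # which cache line the address maps to
        tag = line_number // NUM_LINES   # identifies which block occupies it
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag            # miss: replace whatever was there
    return hits / len(addresses)


# Two locations exactly 256 bytes apart map to the same line and evict
# each other on every access.
thrash = [0x0000, 0x0100] * 50

# Four longwords inside one 16-byte line, reused repeatedly, as with
# recently pushed arguments or local stack storage.
stack_like = [0x0000, 0x0004, 0x0008, 0x000C] * 25

print(hit_rate(thrash))      # 0.0
print(hit_rate(stack_like))  # 0.99
```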

Don North   -----   Apple Computer, Inc.   -----   Advanced Technology Group
UUCP: {voder,nsc,dual,sun}!apple!north                CSNET: north@apple.com
{{ Facts are facts,  but any opinions expressed are my own,  and *do not* }}
{{ represent any viewpoint, official or otherwise, of Apple Computer, Inc.}}

Re: CPU chip cache sizes, was Re: Harvard Architecture

#3871

Author: lindsay@K.GP.CS.
Date: Wed, 16 Mar 1988 18:31

29 lines
1515 bytes

In article <7672@apple.Apple.Com> north@apple.UUCP (Donald N. North) writes:
>Having had some '030 experience as of late, I have found that (in real working
>hardware) the D-cache is always a 'win', even though it is small by most
>standards.  I have yet to find a benchmark (Dhrystone, any others of the small
>integer class) or some real application code in which the performance is less
>when the cache is enabled than disabled.  The typical performance improvement
>ranges from a low of 5% to a high of 25%, 'average' for larger applications
>appears to be about 20%.

You didn't mention how many wait states on a memory access. Also,
you didn't mention if the board supports burst-fill.

I assume that the 20% is for a hot board. I would expect that the cache gets
more useful as boards get slower. In particular, the 68030 should be much
more useful than a 68020 when given a slow 8-bit-wide memory - i.e. a
minimum configuration.

I was recently surprised to learn that 68020 minimum configurations weren't
just showing up in minimum-cost systems. Apparently, some are embedded in
other systems, doing the sort of thing that an 8-bitter could hack (like,
hardware diagnostics). I assume that the 68030 will show up eventually in
this role.

Of course, 68020's have also been used as IO controllers and the like.  Does
anyone have insight into the minimum/maximum aspects of these uses, or the
likelihood of SPARC/MIPS/etc pushing into these roles ?
--
	Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science

Re: Harvard Architecture

#3893

Author: alan@pdn.UUCP (A
Date: Fri, 18 Mar 1988 22:02

30 lines
1189 bytes

In article <7614@apple.Apple.Com> bcase@apple.UUCP (Brian Case) writes:
/In article <1071@PT.CS.CMU.EDU> lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
/>In article <3460011@hpsrla.HP.COM> brucek@hpsrla.HP.COM (Bruce Kleinman)
/> writes about the 68030:
/>>Ahh, those massive 256-Byte caches are really going to speed this puppy up :-)
/
/Talking about 68030.
/
/>
/>Actually, it will. Remember, the CDC 6600 got a win from an "instruction
/>stack" of 480 bits !
/>
/>All in all, a clear improvement. I don't hear any suggestions as to a better
/>use for the silicon.
/
/A little birdie with an EE degree told me that you can expect maybe a 20%
/improvement over a 68020 at the same clock rate.  An improvement, yes, but
/a better use of silicon might have been some on-chip floating point.  Or
/how about more pins so as to expose the harvard architecture to the external
/world?

The 68020 **CONSISTENTLY** benchmarks twice as fast with the instruction
cache turned on (compared to its being turned off).  Apparently the
'030 gets **AT LEAST** a 30% performance boost from turning on the data
cache (so I have been told by those who have benchmarked one).

Enough said.

--alan@pdn

Re: Harvard Architecture

#3899

Author: mash@mips.COM (J
Date: Sat, 19 Mar 1988 21:41

34 lines
1697 bytes

In article <2594@pdn.UUCP> alan@pdn.UUCP (0000-Alan Lovejoy) writes:

>The 68020 **CONSISTENTLY** benchmarks twice as fast with the instruction
>cache turned on (compared to its being turned off).  Apparently the
>'030 gets **AT LEAST** a 30% performance boost from turning on the data
>cache (so I have been told by those who have benchmarked one).
>
>Enough said.

1) Even small I-caches are almost always useful: the question that started
this all was whether or not small D-caches were useful, and if so, how much,
and under what circumstances.

2) One would expect (rightly or wrongly) that overall system design would heavily
influence the benefit level of a small on-chip D-cache. I.e., one would
expect that, for example, turning the D-cache on would help more in
a Sun-3/160 - style design (no external cache) than in a /260 design
(well-designed, fast external cache). (Expectations could be wrong, but...)

3) Data would help: whenever 68030 systems become widely available,
and especially if there are convenient ways to turn the caches on/off,
people could do comp.arch a large service by:
	a) running large, realistic benchmarks
	b) reporting the results
	c) reporting the clock speed and overall memory system configuration.
Until then, all we've got to go on is indirect reports, having no idea what
sorts of benchmarks and configurations are being tested. Alan: can you
possibly offer more of the details, or is it still proprietary?
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

Re: Harvard Architecture (really 680x0 caches)

#3903

Author: radford@calgary.
Date: Sun, 20 Mar 1988 19:24

26 lines
1186 bytes

In article <2594@pdn.UUCP>, alan@pdn.UUCP (Alan Lovejoy) writes:

> The 68020 **CONSISTENTLY** benchmarks twice as fast with the instruction
> cache turned on (compared to its being turned off).  Apparently the
> '030 gets **AT LEAST** a 30% performance boost from turning on the data
> cache (so I have been told by those who have benchmarked one).
>
> Enough said.

This is a bit hard to believe, seeing as the 68020 takes two cycles
to access a word from cache, and only three to access the same word
from external memory (if my memory serves me right). Of course, the
external memory might be slow, and require wait states to be added
to the three cycles. If you're willing to go for that, however, I'm
sure I could build a system in which turning on the cache speeds
things up by a factor of ten. (Well, actually, I'm not sure I could,
not having much experience with a soldering iron, but you know
what I mean... :-)
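
The arithmetic behind that argument can be sketched as follows, using the two
cycle counts quoted above (from memory) and counting memory access time only.
The function names, the hit rates, and the idea of ignoring all other
execution time are simplifying assumptions, so the result is at best an upper
bound on what a benchmark would show.

```python
CACHE_CLOCKS = 2   # 68020 cache access, as recalled in the post
BUS_CLOCKS = 3     # external access with zero wait states, as recalled


def access_clocks(hit_rate, wait_states):
    # Average clocks per access: hits cost the cache time, misses cost
    # a full external bus cycle plus any wait states.
    miss_cost = BUS_CLOCKS + wait_states
    return hit_rate * CACHE_CLOCKS + (1 - hit_rate) * miss_cost


def speedup(hit_rate, wait_states):
    # Cache off is modelled as a hit rate of zero.
    return access_clocks(0.0, wait_states) / access_clocks(hit_rate, wait_states)


for wait_states in (0, 1, 3):
    print(wait_states, speedup(1.0, wait_states))
```

With a perfect hit rate and zero wait states this model tops out at 1.5, and
it only reaches 2.0 once one wait state is added, which is the point about
slow external memory.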

What's needed are data on: (1) the benefit of the '030's data
cache assuming zero wait states (and a 32-bit bus), (2) the
benefit for various other memory configurations, (3) the overall
benefit in systems typical of various applications.

    Radford Neal

Re: Harvard Architecture

#3908

Author: pf@diab.UUCP (Pe
Date: Mon, 21 Mar 1988 07:41

11 lines
461 bytes

In article <2594@pdn.UUCP> alan@pdn.UUCP (0000-Alan Lovejoy) writes:
>The 68020 **CONSISTENTLY** benchmarks twice as fast with the instruction
>cache turned on (compared to its being turned off).  ...............
>
>--alan@pdn

In our experience, tests have shown that turning off the internal cache in the
"020" results in a 20% slowdown if the external memory doesn't have wait states.

So your figure must be from a system with slow external memory, right?

Re: Harvard Architecture

#3943

Author: chow@batcomputer
Date: Wed, 23 Mar 1988 19:24

25 lines
1255 bytes

In article <373@ma.diab.UUCP> pf@ma.UUCP (Per Fogelström) writes:
|In article <2594@pdn.UUCP> alan@pdn.UUCP (0000-Alan Lovejoy) writes:
||The 68020 **CONSISTENTLY** benchmarks twice as fast with the instruction
||cache turned on (compared to its being turned off).  ...............
||
||--alan@pdn
|
|From our experience tests has shown that turning off the internal cache in the
|"020" results in a 20% slowdown if the external memory don't have wait states.
|
|So Your figure must be from a system with slow external memory, right ??

My own tests have also shown that on a Mac II, the 68020 instruction cache
results in a 20% performance change.  (I think the Mac II has 1 wait state.)


Christopher Chow
/---------------------------------------------------------------------------\
| Internet:  chow@tcgould.tn.cornell.edu (128.84.248.35 or 128.84.253.35)   |
| Usenet:    ...{uw-beaver|ihnp4|decvax|vax135}!cornell!batcomputer!chow    |
| Bitnet:    chow@crnlthry.bitnet                                           |
| Phone:     1-607-253-6699   Address: 7122 N. Campus 7, Ithaca, NY 14853   |
| Delphi:    chow2            PAN:  chow                                    |
\---------------------------------------------------------------------------/

Re: Harvard Architecture

#3986

Author: alan@pdn.UUCP (A
Date: Fri, 25 Mar 1988 14:33

28 lines
1339 bytes

In article <373@ma.diab.UUCP> pf@ma.UUCP (Per Fogelström) writes:
>In article <2594@pdn.UUCP> alan@pdn.UUCP (0000-Alan Lovejoy) writes:
>>The 68020 **CONSISTENTLY** benchmarks twice as fast with the instruction
>>cache turned on (compared to its being turned off).  ...............
>>
>>--alan@pdn
>
>From our experience tests has shown that turning off the internal cache in the
>"020" results in a 20% slowdown if the external memory don't have wait states.
>
>So Your figure must be from a system with slow external memory, right ??

Absolutely correct.  150ns DRAMs to be precise.  Using 45ns SRAMs, the
figure is closer to the 20% you quoted (my source gets a 30% difference with his
benchmarks and his compiler).  Somehow only the 150ns DRAM figure stuck
in my mind (perhaps because my main interest is personal computers where
45ns SRAMs are too expensive).  The numbers from the '030 also are for
relatively slow external memory and no external cache.  Sorry, but I
can't be more specific than that.

But an interesting point is raised here:  what's good for a $50,000
workstation may not be so good for a $5000 pc, and vice-versa.  What's
good for running UNIX&C may not be good for running Smalltalk, and
vice versa.  This is not new information, but the discussion in this
group tends to lose sight of it at times.

--alan@pdn
